A while ago I was listening to a talk about uncertainty. The speaker broke
uncertainty down into three types: randomness, incompleteness, and inconsistency. It made me
think about how these three ...

# The Data Analysis Hype Cycle

You may be familiar with Gartner's hype cycle, which illustrates the ups and downs of a new technology as it progresses along its journey from inception to maturity, or ...

# Confidence Intervals for Gamma Distribution Parameters

I use visualization and bootstrapping to obtain empirical confidence intervals for estimates of the $k$ parameter of a Gamma distribution. I consider how the estimates depend on the sample size and on the true value of $k$, and the differences in the estimates computed with the method-of-moments (MoM ...
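To give a flavor of the idea, here is a minimal sketch of a bootstrap confidence interval for $k$. It is not the code from the post: it assumes the method-of-moments estimate $\hat{k} = \bar{x}^2 / s^2$ and a simple percentile bootstrap, and the names `mom_shape` and `bootstrap_ci`, the sample size, and the true parameters are all made up for illustration.

```python
import random
import statistics


def mom_shape(sample):
    """Method-of-moments estimate of the Gamma shape k: mean^2 / variance."""
    mean = statistics.fmean(sample)
    var = statistics.variance(sample)  # unbiased sample variance
    return mean * mean / var


def bootstrap_ci(sample, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the MoM shape estimate (assumed method)."""
    rng = random.Random(seed)
    n = len(sample)
    # Resample with replacement and re-estimate k each time.
    estimates = sorted(
        mom_shape(rng.choices(sample, k=n)) for _ in range(n_boot)
    )
    lo = estimates[int((alpha / 2) * n_boot)]
    hi = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


# Hypothetical setup: 200 draws from a Gamma with shape 2.0 and scale 1.5.
data_rng = random.Random(42)
true_k, theta = 2.0, 1.5
sample = [data_rng.gammavariate(true_k, theta) for _ in range(200)]

k_hat = mom_shape(sample)
lo, hi = bootstrap_ci(sample)
print(f"k_hat = {k_hat:.3f}, 95% bootstrap CI = ({lo:.3f}, {hi:.3f})")
```

Repeating this for several sample sizes and several values of `true_k` is one way to see how the width of the interval changes with each.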

# Know your questions!

Articulating precisely the question you are asking of the data gives clarity and focus to your data analysis. A complication is that often a data analysis flows through different stages in a non-linear way: you have some preliminary assumptions, you explore the data, you revise the assumptions, then notice a ...

# Merging Sensor Data Streams with Python Generators and Priority Queues

A recurring task in multi-sensor data processing is merging -- or interleaving -- data from multiple sensors while maintaining chronological order. For example, you may be combining temperature readings from different weather stations in a region, stringing together the locations of mobile devices recorded by different cell towers, or piecing together the ...
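As a rough illustration of the technique, the sketch below merges two already-sorted streams with the standard library's `heapq.merge`, which uses a heap (a priority queue) internally and consumes its inputs lazily. It is not the code from the post: the `(timestamp, sensor_id, value)` record layout, the helper names `sensor_stream` and `merge_streams`, and the readings are assumptions made up for the example.

```python
import heapq


def sensor_stream(sensor_id, samples):
    """Generator yielding (timestamp, sensor_id, value) records for one sensor.

    `samples` is assumed to be already sorted by timestamp.
    """
    for timestamp, value in samples:
        yield (timestamp, sensor_id, value)


def merge_streams(*streams):
    """Lazily interleave sorted streams in chronological order."""
    # heapq.merge keeps one pending record per stream on a heap,
    # so memory use grows with the number of streams, not their length.
    return heapq.merge(*streams, key=lambda record: record[0])


# Hypothetical temperature readings from two weather stations.
station_a = sensor_stream("A", [(1.0, 20.5), (4.0, 21.0), (7.0, 21.4)])
station_b = sensor_stream("B", [(2.0, 18.9), (3.0, 19.1), (8.0, 19.5)])

merged = list(merge_streams(station_a, station_b))
for record in merged:
    print(record)
```

Because both the inputs and the output are iterators, the same pattern works for streams too long to hold in memory, as long as each individual stream is itself in time order.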

# Similarity Discriminant Analysis

Similarity Discriminant Analysis (SDA) is a generative framework for classifying objects based on their pairwise similarity. I developed SDA with Prof. Maya Gupta for my Ph.D. I am posting the Matlab code for that work here for posterity. It's a set of Matlab scripts for similarity discriminant analysis ...