The Abacus and The Canvas

Luca Cazzanti's blog

The Data Analysis Hype Cycle

You may be familiar with Gartner's hype cycle, which illustrates the ups and downs of a new technology as it progresses along its journey from inception to maturity, or ...

Confidence Intervals for Gamma Distribution Parameters

I use visualization and the bootstrap to obtain empirical confidence intervals for estimates of the $k$ parameter of a Gamma distribution. I consider how the estimates depend on the sample size and on the true value of $k$, and the differences between the estimates computed with the method-of-moments (MoM ...
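As a rough illustration of the idea, here is a minimal sketch of a percentile-bootstrap confidence interval for the method-of-moments estimate of $k$, which is the squared sample mean divided by the sample variance. It is not the code from the post: the function names `mom_shape` and `bootstrap_ci`, the sample size, the number of resamples, and the true values used to simulate data are all assumptions chosen for the example.

```python
import random
import statistics


def mom_shape(sample):
    """Method-of-moments estimate of the Gamma shape k: mean^2 / variance."""
    mean = statistics.fmean(sample)
    var = statistics.variance(sample, mean)  # unbiased sample variance
    return mean * mean / var


def bootstrap_ci(sample, n_boot=1000, alpha=0.05, rng=None):
    """Percentile-bootstrap interval for the MoM estimate of k (illustrative)."""
    rng = rng or random.Random(0)
    n = len(sample)
    # Resample with replacement, re-estimate k each time, then sort.
    estimates = sorted(
        mom_shape(rng.choices(sample, k=n)) for _ in range(n_boot)
    )
    lo = estimates[int((alpha / 2) * n_boot)]
    hi = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


# Simulated data: true shape k = 2.0 and scale 1.5 (assumed values).
data_rng = random.Random(42)
sample = [data_rng.gammavariate(2.0, 1.5) for _ in range(200)]

k_hat = mom_shape(sample)
ci_low, ci_high = bootstrap_ci(sample, rng=random.Random(1))
print(f"k_hat = {k_hat:.3f}, 95% CI = ({ci_low:.3f}, {ci_high:.3f})")
```

Repeating this for several sample sizes and true values of $k$ would give the kind of comparison the post describes.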

Know your questions!

Articulating precisely the question you are asking of the data gives clarity and focus to your data analysis. A complication is that often a data analysis flows through different stages in a non-linear way: you have some preliminary assumptions, you explore the data, you revise the assumptions, then notice a ...

Merging Sensor Data Streams with Python Generators and Priority Queues

A recurring task in multi-sensor data processing is merging -- or interleaving -- data from multiple sensors while maintaining chronological order. For example, you may be combining temperature readings from different weather stations in a region, stringing together the locations of mobile devices recorded by different cell towers, or piecing together the ...
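To make the task concrete, here is a minimal sketch of a chronological merge built from generators and a heap-based priority queue. It is not necessarily the approach taken in the post: the `(timestamp, sensor_id, value)` record layout, the function names, and the sample readings are assumptions made for illustration, and each input stream is assumed to be already sorted by timestamp.

```python
import heapq


def sensor_stream(sensor_id, readings):
    """Yield (timestamp, sensor_id, value) records for one sensor (hypothetical layout)."""
    for timestamp, value in readings:
        yield (timestamp, sensor_id, value)


def merge_streams(*streams):
    """Lazily interleave time-sorted streams in chronological order."""
    iterators = [iter(s) for s in streams]
    heap = []
    # Seed the queue with the first record of each stream.
    for idx, it in enumerate(iterators):
        first = next(it, None)
        if first is not None:
            heapq.heappush(heap, (first[0], idx, first))
    while heap:
        # Pop the earliest record, then refill from the stream it came from.
        _, idx, record = heapq.heappop(heap)
        yield record
        following = next(iterators[idx], None)
        if following is not None:
            heapq.heappush(heap, (following[0], idx, following))


stream_a = sensor_stream("A", [(1.0, 10.0), (4.0, 11.0), (6.0, 12.0)])
stream_b = sensor_stream("B", [(2.0, 20.0), (3.0, 21.0), (7.0, 22.0)])
merged = list(merge_streams(stream_a, stream_b))
for record in merged:
    print(record)
```

The queue holds at most one pending record per stream, so memory use grows with the number of sensors rather than the number of readings. The standard library's `heapq.merge` provides the same behavior for sorted inputs.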

Similarity Discriminant Analysis

Similarity Discriminant Analysis (SDA) is a generative framework for classifying objects based on their pairwise similarity. I developed SDA with Prof. Maya Gupta for my Ph.D. I am posting here the Matlab code for that work for posterity. It's a set of Matlab scripts for similarity discriminant analysis ...