The binomial distribution pops up in our problems daily, given that the number of occurrences $k$ of events with probability $p$ in a sequence of size $n$ can be described as $k \sim \mathrm{Binomial}(n, p)$. The question that naturally arises in this context is: given observations of $k$ and $n$, how do we estimate $p$? One might say that simply computing $k/n$ should be enough, since...
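
To make the naive estimate concrete, here is a minimal sketch that takes the observed fraction of occurrences as the estimate of the event probability. The names `naive_estimate` and `simulate_occurrences`, the variables `k` (number of occurrences) and `n` (sequence size), and the value `true_p = 0.02` are all illustrative assumptions, not taken from the original post.

```python
import random


def naive_estimate(k: int, n: int) -> float:
    """Return the observed fraction k / n as an estimate of the event probability."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= k <= n:
        raise ValueError("k must be between 0 and n")
    return k / n


def simulate_occurrences(n: int, p: float, rng: random.Random) -> int:
    """Count occurrences in n independent trials, each occurring with probability p."""
    return sum(rng.random() < p for _ in range(n))


if __name__ == "__main__":
    rng = random.Random(0)   # fixed seed so repeated runs give the same draws
    true_p = 0.02            # hypothetical event probability, chosen for illustration
    for n in (10, 100, 10_000):
        k = simulate_occurrences(n, true_p, rng)
        print(f"n={n:>6}  k={k:>4}  estimate={naive_estimate(k, n):.4f}")
```

Note that for a sequence of size `n` this estimate can only take values that are multiples of `1/n`, so with short sequences it can sit far from the underlying probability.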

## Handling missing data in K-Means

One of the challenges of building "big data" apps is dealing with messy data sets. At SupplyFrame, we ran into a problem while doing some analysis with K-Means clustering: all of the interesting features in our data had varying amounts of missing values. It turns out that how the values are missing is significant! Say...