Author: Leon Torres

Monitoring WAL lag in PostgreSQL 9.x

When using streaming replication in PostgreSQL 9, it's important to know the latency between the master and its slaves, especially when deploying on cloud-based instances. Ideally, we'd like to know by how many bytes the WAL is lagging. PostgreSQL offers a neat way to check just that between a given slave and its...


Handling missing data in K-Means

One of the challenges of building "big data" apps is dealing with messy data sets. At SupplyFrame, we ran into a problem while doing some analysis with K-Means clustering: all of the interesting features in our data had varying amounts of missing values. It turns out that how the values are missing is significant! Say...