While Model Trains

Read data blog posts.
Carefully handpicked.
Presented 3 at a time.

Are Pop Lyrics Getting More Repetitive?

Colin Morris

A fascinating visual essay that uses the Lempel-Ziv algorithm (which powers GIFs, PNGs, and most archive formats) to analyze whether pop songs are becoming more repetitive.
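
The idea of using compression as a repetitiveness measure can be sketched in a few lines. This is only an illustration, not the essay's own code: it assumes Python's `zlib` (DEFLATE, which combines LZ77 with Huffman coding) as a stand-in compressor, and the function name and sample texts are made up.

```python
import random
import string
import zlib


def compression_ratio(text: str) -> float:
    """Return compressed size / original size; lower suggests more repetition."""
    raw = text.encode("utf-8")
    if not raw:
        raise ValueError("text must be non-empty")
    return len(zlib.compress(raw, 9)) / len(raw)


# Hypothetical "lyrics": one line repeated many times.
repetitive = "na na na na hey hey hey goodbye\n" * 40

# Baseline of the same length made of seeded random characters,
# which gives the compressor almost nothing to reuse.
rng = random.Random(0)
alphabet = string.ascii_lowercase + " "
varied = "".join(rng.choice(alphabet) for _ in range(len(repetitive)))

ratio_repetitive = compression_ratio(repetitive)
ratio_varied = compression_ratio(varied)
print(f"repetitive: {ratio_repetitive:.3f}, varied: {ratio_varied:.3f}")
```

The repeated chorus should compress to a small fraction of its size, while the random baseline barely shrinks. Comparing such ratios across songs or years is the general shape of the analysis.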

Read it!

How to calculate Shapley values from scratch

Tobias Sterbak

Understand Shapley values by implementing them from scratch.
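
For a flavor of what "from scratch" can look like, here is a minimal sketch of the exact game-theoretic Shapley formula on a toy three-player game. It is not taken from the post: the function name, the players, and the "glove game" payoff are illustrative assumptions, and the brute-force enumeration is only practical for a handful of players.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values by enumerating every coalition (exponential cost)."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that p joins: k!(n-k-1)!/n!
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = frozenset(coalition)
                # Marginal contribution of p to coalition s.
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi


def glove_game(coalition):
    """Toy payoff: A holds a left glove, B and C each hold a right glove.
    A coalition is worth 1 if it can form a pair, else 0."""
    return 1.0 if "A" in coalition and ({"B", "C"} & coalition) else 0.0


phi = shapley_values(["A", "B", "C"], glove_game)
print(phi)
```

In this toy game A receives 2/3 and B and C receive 1/6 each, and the values sum to the worth of the full coalition. In a machine learning setting the "players" would be features and the payoff a model's prediction, which is where approximation methods become necessary.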

Read it!

In Praise of Small Data

Evan Miller

Unless you are training a model with thousands of parameters, big data should not be seen as a source of value, but rather as a source of cost.

Read it!