While Model Trains

Read data blog posts.
Carefully handpicked.
Presented 3 at a time.

The Simpsons by the Data

Todd W. Schneider

An analysis of the first 27 seasons of The Simpsons, featuring great plots and memes. It covers the most significant side characters, a pattern of patriarchy, and declining TV ratings, and highlights some of the most relevant sentences.

Read it!

Understanding the beta distribution (using baseball statistics)

David Robinson

"The beta distribution is best for representing a probabilistic distribution of probabilities- the case where we don’t know what a probability is in advance, but we have some reasonable guesses."

Read it!

Data Analysis at the Command Line

@lucytalksdata

Using csvkit, grep, gnuplot, and other tools to perform data analysis directly from the command line.

Read it!