While Model Trains

Read data blog posts.
Carefully handpicked.
Presented 3 at a time.

Understanding the beta distribution (using baseball statistics)

David Robinson

"The beta distribution is best for representing a probabilistic distribution of probabilities- the case where we don’t know what a probability is in advance, but we have some reasonable guesses."

Read it!

Effective testing for machine learning systems.

Jeremy Jordan

"Effective testing for machine learning systems requires both a traditional software testing suite (for model development infrastructure) and a model testing suite (for trained models)."

Read it!

In Praise of Small Data

Evan Miller

Unless you are training a model with thousands of parameters, big data should not be seen as a source of value, but rather as a source of cost.

Read it!