... birthday problem: conclusion


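Here's the code that produces the plot. It compares the observed share of births on each day of the year (p) with the uniform share you get if every day is assumed equally likely (p_fake).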

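import pandas as pd

# Daily birth counts; the code below relies on the 'date' and 'births' columns.
df = pd.read_csv("birthdays.csv")

plot_df = (df
  .assign(date = lambda d: pd.to_datetime(d['date']))
  .assign(day_of_year = lambda d: d['date'].dt.dayofyear)
  # Sum the births per day of the year, giving one row per day.
  .groupby('day_of_year')
  .agg(n_births=('births', 'sum'))
  # Turn the counts into observed probabilities.
  .assign(p = lambda d: d['n_births']/d['n_births'].sum()))

# p_fake is the uniform probability 1/N, with N the number of distinct days.
plot_df.assign(p_fake = lambda d: 1/d.shape[0])[['p', 'p_fake']].plot()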
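
If you don't have birthdays.csv at hand, the same pipeline can be sanity-checked on made-up data. The sketch below is only an illustration: `fake_df`, `check_df` and the random birth counts are hypothetical stand-ins, not the real dataset. It covers two non-leap years so that every day of the year appears twice.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for birthdays.csv: two non-leap years of made-up daily counts.
rng = np.random.default_rng(0)
dates = pd.date_range("2001-01-01", "2002-12-31", freq="D")
fake_df = pd.DataFrame({
    "date": dates,
    "births": rng.integers(150, 250, size=len(dates)),
})

check_df = (fake_df
  .assign(day_of_year=lambda d: d["date"].dt.dayofyear)
  # One row per day of the year, summing over both years.
  .groupby("day_of_year")
  .agg(n_births=("births", "sum"))
  # Observed share of births per day, next to the uniform share 1/N.
  .assign(p=lambda d: d["n_births"] / d["n_births"].sum())
  .assign(p_fake=lambda d: 1 / d.shape[0]))

print(check_df.shape[0])    # number of distinct days of the year
print(check_df["p"].sum())  # should be 1, up to floating-point error
```

Two quick checks are worth doing on the result: `p` should sum to one, and `p_fake` should be the same value on every row. If both hold, calling `.plot()` on the `p` and `p_fake` columns should draw a noisy line around a flat one.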

We hope you enjoyed the little thought experiment.

If you want to download the entire notebook, feel free to grab it here.

Feedback? See an issue? Something unclear? Feel free to mention it here.
