birthday problem: conclusion

Notes

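Here's the code that produces the plot. It compares the observed share of births on each day of the year (p) with the share you'd get if every day were equally likely (p_fake).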

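import matplotlib.pyplot as plt
import pandas as pd

# birthdays.csv needs a `date` column and a `births` count column.
df = pd.read_csv("birthdays.csv")

# Share of all births that fall on each day of the year.
plot_df = (df
  .assign(date = lambda d: pd.to_datetime(d['date']))
  .assign(day_of_year = lambda d: d['date'].dt.dayofyear)
  .groupby('day_of_year')
  .agg(n_births=('births', 'sum'))
  .assign(p = lambda d: d['n_births']/d['n_births'].sum()))

# Plot the observed shares (p) against a uniform distribution (p_fake).
plot_df.assign(p_fake = lambda d: 1/d.shape[0])[['p', 'p_fake']].plot()
plt.ylim(0);  # start the y-axis at zero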
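
If you don't have birthdays.csv at hand, here's a rough sketch of what the pipeline computes, run on a tiny made-up table instead. The `toy` frame and its numbers are invented purely for illustration; only the column names `date` and `births` come from the code above.

```python
import pandas as pd

# Hypothetical stand-in for birthdays.csv: made-up birth counts
# for two calendar days across two years.
toy = pd.DataFrame({
    "date": ["2000-01-01", "2001-01-01", "2000-01-02", "2001-01-02"],
    "births": [10, 30, 20, 40],
})

toy_plot_df = (toy
  .assign(date=lambda d: pd.to_datetime(d["date"]))
  .assign(day_of_year=lambda d: d["date"].dt.dayofyear)
  .groupby("day_of_year")                     # pool the same day across years
  .agg(n_births=("births", "sum"))
  .assign(p=lambda d: d["n_births"] / d["n_births"].sum())
  .assign(p_fake=lambda d: 1 / d.shape[0]))   # uniform share per observed day

print(toy_plot_df)
```

Day 1 collects 40 of the 100 births and day 2 collects 60, so p comes out as 0.4 and 0.6, while p_fake is 0.5 for both. On the real data the same steps give one row per day of the year.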

We hope you enjoyed this little thought experiment.

If you want the entire notebook, feel free to grab it from the GitHub repository.