sleep: calculating differences
A university in Italy was researching the effect of sleep on programming performance. The question is: where do we draw the line? When is the difference in performance big enough that we can no longer say it is due to chance? And could GPA explain the effect instead?
This is the code we used in this notebook:
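    import numpy as np
    import pandas as pd

    # `df` is assumed to be loaded earlier in the notebook.

    def reshuffle(dataf):
        # Shuffle the rows, then label the first 15 'deprived' and the rest 'normal'.
        return (dataf
                .sample(36)
                .reset_index(drop=True)
                .assign(sleep=lambda d: np.where(d.index < 15, 'deprived', 'normal')))

    def calc_diff(dataf):
        # Mean of each metric per sleep group, transposed so the groups become columns.
        agg = (dataf
               .groupby('sleep')
               .agg(mean_unit_tests=('passed_unit_tests', 'mean'),
                    mean_asserts=('passed_asserts', 'mean'),
                    mean_user_stories=('tackled_user_stories', 'mean'))).T
        return agg['deprived'] - agg['normal']

    # Repeat the shuffle many times to see which differences chance alone produces.
    n = 1000
    results = np.zeros((n, 3))
    for i in range(n):
        results[i, :] = calc_diff(reshuffle(df))

    df_diff = pd.DataFrame(results,
                           columns=['diff_unit_tests', 'diff_asserts', 'diff_user_stories'])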
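One way to turn the shuffled differences into an answer to "is this due to chance?" is to count how often a shuffle produces a difference at least as extreme as the one actually observed. The sketch below is not from the notebook: `permutation_p_value` is a hypothetical helper, and the numbers fed into it are made up purely to show the mechanics.

```python
import numpy as np

def permutation_p_value(simulated_diffs, observed_diff):
    """Share of shuffled differences at least as extreme (two-sided) as the observed one."""
    simulated_diffs = np.asarray(simulated_diffs, dtype=float)
    return float(np.mean(np.abs(simulated_diffs) >= abs(observed_diff)))

# Made-up values for illustration only; these are NOT the study's results.
rng = np.random.default_rng(42)
fake_simulated = rng.normal(loc=0.0, scale=1.0, size=1000)
fake_observed = -2.5

p_value = permutation_p_value(fake_simulated, fake_observed)
print(f"share of shuffles at least as extreme: {p_value:.3f}")
```

In the notebook this could be called as `permutation_p_value(df_diff['diff_unit_tests'], calc_diff(df)['mean_unit_tests'])`, assuming `df` still carries its original `sleep` labels. A small share suggests the observed difference is hard to attribute to chance alone. It says nothing yet about whether GPA explains it.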