ray: introduction

Notes

Python typically runs all your code on a single core, even when the program you're running has components that could easily run in parallel. To speed up programs in such scenarios you might want to explore ray. It's not the only tool for this use case, but it's one we've come to like.

The base simulation, which estimates the probability behind the birthday paradox, runs with the code below.

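import numpy as np

def birthday_experiment(class_size, n_sim=1000):
    """Simulates the birthday paradox. Vectorized = Fast!"""
    # One row per simulated class, one random birthday (day 1..365) per student.
    sims = np.random.randint(1, 365 + 1, (n_sim, class_size))
    # After sorting each row, equal birthdays sit next to each other.
    sort_sims = np.sort(sims, axis=1)
    # Count the places where neighbours differ; adding 1 gives the number of unique birthdays.
    n_uniq = (sort_sims[:, 1:] != sort_sims[:, :-1]).sum(axis=1) + 1
    # A class has a shared birthday when it has fewer unique days than students.
    return {"est_prob": np.mean(n_uniq != class_size)}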
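As a sanity check, you can compare the simulated estimate against the closed-form probability. The sketch below assumes 365 equally likely birthdays. The helper `exact_birthday_prob` is a hypothetical name introduced here for illustration, and the simulation function is repeated only so the block runs on its own.

```python
import numpy as np

def birthday_experiment(class_size, n_sim=1000):
    """Same simulation as above, repeated so this block is self-contained."""
    sims = np.random.randint(1, 365 + 1, (n_sim, class_size))
    sort_sims = np.sort(sims, axis=1)
    n_uniq = (sort_sims[:, 1:] != sort_sims[:, :-1]).sum(axis=1) + 1
    return {"est_prob": np.mean(n_uniq != class_size)}

def exact_birthday_prob(class_size, n_days=365):
    """Hypothetical helper: exact chance of at least one shared birthday."""
    # P(all distinct) = (365/365) * (364/365) * ... with class_size factors.
    p_distinct = np.prod((n_days - np.arange(class_size)) / n_days)
    return 1.0 - p_distinct

# Print the simulated and exact values side by side for a few class sizes.
for size in (10, 23, 50):
    est = birthday_experiment(size, n_sim=20_000)["est_prob"]
    print(size, round(est, 3), round(exact_birthday_prob(size), 3))
```

The two columns should agree to roughly two decimal places; larger values of `n_sim` tighten the match at the cost of more compute.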

You can time how long this takes.

%%time
results = [birthday_experiment(class_size=size, n_sim=10_000) for size in range(2, 100)]

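Note that the %%time cell magic times a single run, so the measurement is noisy. The related %%timeit magic repeats the cell several times and gives a steadier number.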
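Outside of a notebook, a steadier measurement can be sketched with the standard library's `timeit.repeat`, which runs the same workload several times. The wrapper `run_all` and the choice of three repeats are assumptions made for this illustration, and the simulation function is repeated so the block runs on its own.

```python
import timeit
import numpy as np

def birthday_experiment(class_size, n_sim=1000):
    """Same simulation as above, repeated so this block is self-contained."""
    sims = np.random.randint(1, 365 + 1, (n_sim, class_size))
    sort_sims = np.sort(sims, axis=1)
    n_uniq = (sort_sims[:, 1:] != sort_sims[:, :-1]).sum(axis=1) + 1
    return {"est_prob": np.mean(n_uniq != class_size)}

def run_all():
    """Hypothetical wrapper around the workload that was timed with %%time."""
    return [birthday_experiment(class_size=size, n_sim=10_000) for size in range(2, 100)]

# Run the full workload three separate times, once per measurement.
timings = timeit.repeat(run_all, number=1, repeat=3)
print(f"min {min(timings):.3f}s, median {np.median(timings):.3f}s")
```

The minimum is usually the least disturbed by other processes on the machine, while the spread between runs shows how noisy a single measurement can be.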