patsy:
categories
There are many ways to get data from pandas into scikit-learn, but when you're hacking in a notebook you may prefer something more expressive, like a domain-specific grammar. The tool patsy offers exactly this by mimicking the formula syntax of the R language.
Notes
Here's the formula that the video ends with:
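import patsy as ps

# "- 1" drops the intercept column; C(month) treats month as categorical.
y, X = ps.dmatrices("n_born ~ wday + yday - 1 + C(month)", df_clean)
print(X[:5])  # first five rows of the design matrix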
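To see what the formula produces without the video's dataset, here is a minimal sketch on a toy dataframe. The name `df_toy` and its values are made up for illustration, and it assumes `wday` and `yday` are numeric, which may differ from the real `df_clean`.

```python
import pandas as pd
import patsy as ps

# Hypothetical toy data; only the column names mirror the formula above.
df_toy = pd.DataFrame({
    "n_born": [8000, 8200, 7900, 8100, 8300, 8050],
    "wday": [0, 1, 2, 3, 4, 5],
    "yday": [1, 2, 32, 33, 60, 61],
    "month": [1, 1, 2, 2, 3, 3],
})

# Same formula as above: no intercept, month encoded as a category.
y, X = ps.dmatrices("n_born ~ wday + yday - 1 + C(month)", df_toy)

# Inspect which columns patsy generated and the resulting shapes.
print(X.design_info.column_names)
print(X.shape)
print(y.shape)
```

Because the intercept is removed, each of the three months gets its own indicator column, so `X` ends up with five columns: three for `C(month)` plus `wday` and `yday`.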