
... patsy.


There are many ways to get data from pandas into scikit-learn, but when you're hacking in a notebook you may prefer something more expressive, like a domain-specific grammar. The tool patsy offers exactly this by mimicking the formula syntax of the R language.


Notes

Here's the example that creates all the interaction terms, up to third order, between the three features and then drops the standalone month term.

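import patsy as ps

def date_to_num(date_col):
    # Number of days elapsed since the earliest date in the column.
    return (date_col - date_col.min()).dt.days

# (a + b + c)**3 expands to the main effects plus every two-way and
# three-way interaction; "- month" then removes the standalone month term.
y, X = ps.dmatrices("n_born ~ (date_to_num(date) + yday + month)**3 - month", df_clean)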
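
The snippet above depends on a `df_clean` dataframe that isn't defined in these notes. To try the formula on its own, here is a minimal sketch that uses a small made-up dataframe instead. The name `df_demo` and its values are assumptions for illustration only, and `month` is kept as a plain integer column here, which may differ from how it is stored in `df_clean`.

```python
import pandas as pd
import patsy as ps

# Hypothetical stand-in for df_clean: six made-up rows, illustration only.
df_demo = pd.DataFrame({
    "date": pd.to_datetime([
        "2000-01-01", "2000-01-02", "2000-02-01",
        "2000-02-02", "2000-03-01", "2000-03-02",
    ]),
    "n_born": [10, 12, 9, 11, 13, 8],
})
df_demo["yday"] = df_demo["date"].dt.dayofyear
df_demo["month"] = df_demo["date"].dt.month  # numeric here, not categorical


def date_to_num(date_col):
    # Number of days elapsed since the earliest date in the column.
    return (date_col - date_col.min()).dt.days


# Same formula as in the notes, applied to the demo dataframe.
y, X = ps.dmatrices(
    "n_born ~ (date_to_num(date) + yday + month)**3 - month", df_demo
)

# The design matrix keeps track of which column belongs to which term.
column_names = X.design_info.column_names
for name in column_names:
    print(name)
```

Printing the column names is a quick way to check what the formula expanded into: an intercept, the remaining main effects, and the interaction terms (the ones with a `:` in their name). Both `y` and `X` behave like numpy arrays, so `np.asarray(X)` should give you something you can hand to a scikit-learn estimator.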
