... patsy: lego


To use scikit-lego you'll need to install it first:

pip install scikit-lego
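
Note that the package is installed as scikit-lego but imported as sklego. If you'd like to confirm that the install worked before moving on, a small check along these lines may help. It is only a sketch that uses the standard library, and the is_installed helper is a made-up name for illustration, not something scikit-lego provides.

```python
from importlib.util import find_spec


def is_installed(module_name):
    """Return True if a top-level module with this name can be imported."""
    # find_spec returns None when no such top-level module is found
    return find_spec(module_name) is not None


# The pip package is called scikit-lego, the importable module is sklego.
print("sklego installed:", is_installed("sklego"))
```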

You can now use its PatsyTransformer as a step in a scikit-learn pipeline.

import numpy as np
import matplotlib.pylab as plt

from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline

from sklego.preprocessing import PatsyTransformer

# df_clean and date_to_num are assumed to be defined earlier
X = (df_clean
    .loc[lambda d: d['n_born'] > 2000]
    .assign(num_date = lambda d: date_to_num(d['date'])))
y = X['n_born']

# The patsy formula builds the features, which are then scaled and fed to the model
pipe = Pipeline([
    ("patsy", PatsyTransformer("(cc(yday, df=12) + wday + num_date)**2")),
    ("scale", StandardScaler()),
    ("model", LinearRegression())
])

# Mean absolute error on the training data
np.mean(np.abs(pipe.fit(X, y).predict(X) - y))
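
The **2 in the formula asks patsy for every main effect plus every pairwise interaction between the three terms. To make that concrete, here is a rough sketch in plain Python that lists the terms such a formula expands to. The expand_terms helper is hypothetical and only mimics the expansion for illustration; it is not part of patsy or scikit-lego, and it leaves out the intercept that patsy adds by default.

```python
from itertools import combinations


def expand_terms(terms, degree=2):
    """List the terms that a patsy-style `(a + b + ...)**degree` expands to."""
    expanded = []
    # k=1 gives the main effects, k=2 the pairwise interactions, and so on
    for k in range(1, degree + 1):
        for combo in combinations(terms, k):
            # patsy writes an interaction between terms with a colon
            expanded.append(":".join(combo))
    return expanded


terms = ["cc(yday, df=12)", "wday", "num_date"]
for term in expand_terms(terms):
    print(term)
```

With three input terms this prints six lines: the three original terms followed by their three pairwise interactions.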

For details, see the PatsyTransformer entry in the scikit-lego documentation.
