Calmcode TIL

Fancy Diagrams

Scikit-learn recently added an HTML diagram view that can render your pipelines in a notebook. To activate it, you'll need to run:

from sklearn import set_config

set_config(display="diagram")
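If you want to check which display mode is active, or temporarily flip back to the plain-text repr, `get_config` and `config_context` from `sklearn` should do the trick. A small sketch (the variable names are only for illustration):

```python
from sklearn import config_context, get_config, set_config

# Turn the diagram view on globally.
set_config(display="diagram")

# Inside the context manager the setting is overridden ...
with config_context(display="text"):
    mode_inside = get_config()["display"]

# ... and restored again once the block exits.
mode_after = get_config()["display"]

print(mode_inside, mode_after)
```

The context manager is handy when you only want the text repr for a single cell or log statement.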

To repeat what's mentioned in the official docs, here's a more elaborate example:

from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler, PolynomialFeatures
from sklearn.linear_model import LogisticRegression
from sklearn import set_config

steps = [
    ("standard_scaler", StandardScaler()),
    ("polynomial", PolynomialFeatures(degree=3)),
    ("classifier", LogisticRegression(C=2.0)),
]

pipe = Pipeline(steps)

When you set the config and evaluate the pipe-variable ...

set_config(display="diagram")

pipe

... you'll get something that looks like this.

Note that the diagram is interactive: you can click on a step to expand it and see its settings!
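Outside of a notebook there's nothing to click on, so it can help to write the diagram to a file and open it in a browser. Here's a minimal sketch, assuming the `estimator_html_repr` helper from `sklearn.utils`; the file name `pipe_diagram.html` is just a placeholder.

```python
import tempfile
from pathlib import Path

from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.utils import estimator_html_repr

pipe = Pipeline([
    ("standard_scaler", StandardScaler()),
    ("polynomial", PolynomialFeatures(degree=3)),
    ("classifier", LogisticRegression(C=2.0)),
])

# Build the HTML for the diagram as a plain string.
html = estimator_html_repr(pipe)

# Placeholder output location; open this file in a browser to view the diagram.
out_path = Path(tempfile.gettempdir()) / "pipe_diagram.html"
out_path.write_text(html, encoding="utf-8")
print(out_path)
```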

Benefits

The diagrams can display settings, nested pipelines and custom components too! Suppose we take a pipeline from the scikit-lego docs:

from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline, FeatureUnion
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklego.preprocessing import ColumnSelector


feature_pipeline = Pipeline([
    ("datagrab", FeatureUnion([
         ("discrete", Pipeline([
             ("grab", ColumnSelector("diet")),
             # Note: `sparse` was renamed to `sparse_output` in scikit-learn 1.2.
             ("encode", OneHotEncoder(categories="auto", sparse_output=False))
         ])),
         ("continuous", Pipeline([
             ("grab", ColumnSelector("time")),
             ("standardize", StandardScaler())
         ]))
    ]))
])

pipe = Pipeline([
    ("transform", feature_pipeline),
    ("model", LinearRegression())
])

Here's what the pipeline looks like:

If you want the best results with these custom components, we recommend that your custom classes inherit from the BaseEstimator that comes with scikit-learn. That way, the parameters will render nicely. An example implementation can be found in the scikit-lego GitHub repo.
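To make that concrete, here's a rough sketch of a custom component built on `BaseEstimator`. The `ColumnClipper` class and its `lower`/`upper` parameters are made up for this example and don't come from scikit-lego. The part that matters is that `__init__` stores its arguments unchanged under the same names, because that is what `get_params` reads back.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.utils import estimator_html_repr


class ColumnClipper(BaseEstimator, TransformerMixin):
    """Hypothetical transformer that clips every value to [lower, upper]."""

    def __init__(self, lower=0.0, upper=1.0):
        # Keep constructor arguments as-is, under the same attribute names,
        # so that BaseEstimator.get_params() can find them.
        self.lower = lower
        self.upper = upper

    def fit(self, X, y=None):
        # Nothing to learn; the clipping bounds are fixed by the user.
        return self

    def transform(self, X):
        return np.clip(X, self.lower, self.upper)


pipe = Pipeline([
    ("clip", ColumnClipper(lower=-2.0, upper=2.0)),
    ("model", LinearRegression()),
])

# The custom step exposes its settings just like a built-in estimator.
params = pipe.get_params()
print(params["clip__lower"], params["clip__upper"])

# The diagram picks up the custom class by name.
html = estimator_html_repr(pipe)
```

Evaluating `pipe` in a notebook with the diagram display on should then show `ColumnClipper` as its own box, next to `LinearRegression`.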

