
Calmcode TIL

Fancy Diagrams

Scikit-learn added a diagram representation in version 0.23 that can render your pipelines. To activate it, you'll need to run:

from sklearn import set_config

set_config(display="diagram")

To repeat what's mentioned in the official docs, here's an elaborate example:

from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler, PolynomialFeatures
from sklearn.linear_model import LogisticRegression
from sklearn import set_config

steps = [
    ("standard_scaler", StandardScaler()),
    ("polynomial", PolynomialFeatures(degree=3)),
    ("classifier", LogisticRegression(C=2.0)),
]

pipe = Pipeline(steps)

When you set the config and evaluate the pipe variable ...

set_config(display="diagram")
pipe

... you'll get something that looks like this.

Note that the diagram is interactive: you can click on a step to expand its details.
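
The diagram shows up automatically in a notebook, but the same HTML can also be produced as a plain string. The sketch below assumes `sklearn.utils.estimator_html_repr` is available (it ships with the scikit-learn releases that support `display="diagram"`) and rebuilds the pipeline from above so that it runs on its own.

```python
from sklearn import set_config
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.utils import estimator_html_repr

# Switch the notebook representation of estimators to the diagram.
set_config(display="diagram")

pipe = Pipeline([
    ("standard_scaler", StandardScaler()),
    ("polynomial", PolynomialFeatures(degree=3)),
    ("classifier", LogisticRegression(C=2.0)),
])

# Build the HTML markup of the diagram as a string.
# No fitting is needed; the diagram only describes the pipeline structure.
html = estimator_html_repr(pipe)
```

Writing `html` to a file and opening it in a browser should give the same clickable diagram outside of a notebook.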


The diagrams can display settings, nested pipelines and custom components too! Suppose we take a pipeline from the scikit-lego docs:

from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline, FeatureUnion
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklego.preprocessing import ColumnSelector

feature_pipeline = Pipeline([
    ("datagrab", FeatureUnion([
         ("discrete", Pipeline([
             ("grab", ColumnSelector("diet")),
             # on scikit-learn >= 1.2 the `sparse` argument is named `sparse_output`
             ("encode", OneHotEncoder(categories="auto", sparse=False))
         ])),
         ("continuous", Pipeline([
             ("grab", ColumnSelector("time")),
             ("standardize", StandardScaler())
         ]))
    ]))
])

pipe = Pipeline([
    ("transform", feature_pipeline),
    ("model", LinearRegression())
])

Here's what the pipeline looks like:

If you want the best results with custom components, we recommend building your custom classes on top of the BaseEstimator that comes from scikit-learn. That way, the diagram can pick up the constructor parameters and render them nicely. An example of an implementation can be found on the scikit-lego GitHub repo.
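
As a rough illustration of that advice, here is a minimal sketch of a custom component. The `ScaleBy` class and its `factor` parameter are made up for this example and are not part of scikit-learn or scikit-lego. Because the class inherits from `BaseEstimator` and stores its constructor argument unchanged, `get_params()` can report the setting, and the HTML representation has something to show.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.utils import estimator_html_repr


class ScaleBy(TransformerMixin, BaseEstimator):
    """Hypothetical transformer that multiplies every value by a constant."""

    def __init__(self, factor=1.0):
        # Keep the argument under the same attribute name so that
        # BaseEstimator.get_params() can find it.
        self.factor = factor

    def fit(self, X, y=None):
        # Nothing to learn for this toy transformer.
        return self

    def transform(self, X):
        return np.asarray(X) * self.factor


scaler = ScaleBy(factor=2.5)
params = scaler.get_params()        # settings collected by BaseEstimator
html = estimator_html_repr(scaler)  # markup used for the diagram
```

Dropping such a class into a `Pipeline` step should make it appear in the diagram like any built-in estimator.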
