The way data is preprocessed can have a huge effect on your scikit-learn pipelines. This series of videos highlights common techniques for preprocessing data for modelling.
import numpy as np
import pandas as pd
import matplotlib.pylab as plt

df = pd.read_csv("drawndata1.csv")
X = df[['x', 'y']].values
y = df['z'] == "a"
plt.scatter(X[:, 0], X[:, 1], c=y);
To see the effect of the quantile transformer, run this:
from sklearn.preprocessing import QuantileTransformer

X_new = QuantileTransformer(n_quantiles=100).fit_transform(X)
plt.scatter(X_new[:, 0], X_new[:, 1], c=y);
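If you don't have drawndata1.csv at hand, the snippet below is a rough sketch of the same idea on synthetic data. The names `rng`, `X_demo` and `X_demo_new` are made up for this illustration, and the injected outlier is an assumption added to make the effect easier to spot.

```python
import numpy as np
from sklearn.preprocessing import QuantileTransformer

# Synthetic stand-in for the dataset (an assumption, not the original file).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(500, 2))

# Add one extreme point so the effect on outliers is visible.
X_demo[0] = [1000.0, -1000.0]

# Same transformer settings as in the snippet above.
X_demo_new = QuantileTransformer(n_quantiles=100).fit_transform(X_demo)

# With the default uniform output, every column ends up in the range [0, 1].
print(X_demo_new.min(axis=0), X_demo_new.max(axis=0))
```

Because the mapping is based on ranks rather than raw values, the outlier lands at the edge of the range instead of stretching the scale for all the other points.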