scikit prep:
poly
The way data is preprocessed can have a huge effect in your scikit-learn pipelines. This series of videos will highlight common techniques for preprocessing data for modelling.
Notes
You can download the dataset here.
Here's what the original dataset looks like.
df = pd.read_csv("drawndata2.csv")
X = df[['x', 'y']].values
y = df['z'] == 'a'
plt.scatter(X[:, 0], X[:, 1], c=y);
Here's the PolynomialFeatures
at work.
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import Pipeline
pipe = Pipeline([
("scale", PolynomialFeatures()),
("model", LogisticRegression())
])
pred = pipe.fit(X, y).predict(X)
plt.scatter(X[:, 0], X[:, 1], c=pred);
Feedback? See an issue? Something unclear? Feel free to mention it here.
If you want to be kept up to date, consider signing up for the newsletter.