
... scikit prep.


The way data is preprocessed can have a huge effect on your scikit-learn pipelines. This series of videos highlights common techniques for preprocessing data for modelling.


Notes

The basic use of the OneHotEncoder is demonstrated below.

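import numpy as np
from sklearn.preprocessing import OneHotEncoder

# A single categorical column, reshaped to (n_samples, 1) as the encoder expects.
arr = np.array(["low", "low", "high", "medium"]).reshape(-1, 1)
# scikit-learn 1.2 renamed `sparse` to `sparse_output`; on older versions use sparse=False.
enc = OneHotEncoder(sparse_output=False, handle_unknown='ignore')
enc.fit_transform(arr)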
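To see which output column belongs to which category, it can help to inspect the fitted encoder. The sketch below is an illustration rather than part of the video; the variable names `encoded` and `column_order` are made up for this example, and it calls `.toarray()` on the default sparse output so that it does not depend on the scikit-learn version.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

arr = np.array(["low", "low", "high", "medium"]).reshape(-1, 1)

# The default output is a sparse matrix; .toarray() makes it dense for inspection.
enc = OneHotEncoder(handle_unknown="ignore")
encoded = enc.fit_transform(arr).toarray()

# categories_ holds one array per input column, sorted alphabetically.
# The output columns follow this order.
column_order = list(enc.categories_[0])

print(column_order)
print(encoded)
```

The columns come out in sorted category order, not in the order the values first appear in the data.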

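Because we set handle_unknown="ignore", the encoder accepts a category it never saw during fitting, so the following line runs without raising an error. The unseen value is encoded as a row of all zeros.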

enc.transform([["zero"]])
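As a rough comparison, the sketch below contrasts handle_unknown="ignore" with handle_unknown="error", which is the default setting. The names `lenient`, `strict` and `strict_raised` are hypothetical and used only for this illustration.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

arr = np.array(["low", "low", "high", "medium"]).reshape(-1, 1)

# With "ignore", an unseen category becomes a row of zeros.
lenient = OneHotEncoder(handle_unknown="ignore").fit(arr)
known = lenient.transform([["low"]]).toarray()
unknown = lenient.transform([["zero"]]).toarray()

# With "error", the same input raises a ValueError.
strict = OneHotEncoder(handle_unknown="error").fit(arr)
try:
    strict.transform([["zero"]])
    strict_raised = False
except ValueError:
    strict_raised = True

print(known)
print(unknown)
print(strict_raised)
```

Whether to ignore unknown categories is a judgment call: it keeps a pipeline running on new data, but an all-zero row can also hide a typo or a shift in the incoming categories.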
