The way data is preprocessed can have a huge effect on your scikit-learn pipelines. This series of videos will highlight common techniques for preprocessing data for modelling.
The basic use of the OneHotEncoder is demonstrated below.
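import numpy as np
from sklearn.preprocessing import OneHotEncoder

# The encoder expects a 2D input: one row per sample, one column per feature.
arr = np.array(["low", "low", "high", "medium"]).reshape(-1, 1)

# Note: this keyword is `sparse_output` from scikit-learn 1.2 onwards;
# older versions call it `sparse`.
enc = OneHotEncoder(sparse_output=False, handle_unknown='ignore')
enc.fit_transform(arr)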
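To see how the encoded columns line up with the original labels, it can help to inspect the fitted encoder. The snippet below is a minimal sketch rather than part of the original lesson: the variable names `encoded` and `categories` are illustrative, and it calls `.toarray()` on the default sparse output so that it does not depend on the version-specific `sparse` / `sparse_output` keyword.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

arr = np.array(["low", "low", "high", "medium"]).reshape(-1, 1)

# Keep the default sparse output and convert it to a dense array afterwards.
enc = OneHotEncoder(handle_unknown="ignore")
encoded = enc.fit_transform(arr).toarray()

# categories_ holds one array per input column; string labels are sorted,
# so the output columns here are ordered high, low, medium.
categories = enc.categories_[0].tolist()

print(categories)
print(encoded)
```

Each row of `encoded` contains a single 1 in the column that matches the row's label, so the two "low" rows share the same encoding.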
Because we have set handle_unknown="ignore", we can transform a value that the encoder did not see during fitting without causing an error. The unseen value is encoded as a row of zeros.
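The effect of handle_unknown can be sketched as follows. This is an illustrative example, not the original lesson's code: the unseen value "zero" and the variable names `unseen`, `ignored`, `strict` and `raised` are assumptions chosen for the demonstration.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

arr = np.array(["low", "low", "high", "medium"]).reshape(-1, 1)

# "zero" is a hypothetical label that was not present during fitting.
unseen = np.array([["zero"]])

# With handle_unknown="ignore" the unseen label becomes an all-zero row.
enc = OneHotEncoder(handle_unknown="ignore").fit(arr)
ignored = enc.transform(unseen).toarray()
print(ignored)

# With the default handle_unknown="error" the same call raises a ValueError.
strict = OneHotEncoder().fit(arr)
try:
    strict.transform(unseen)
    raised = False
except ValueError:
    raised = True
print(raised)
```

Ignoring unknown categories is convenient in production pipelines, where new labels may appear after training, but it also means such rows carry no information about the category, so it is worth checking how often this happens.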