The way data is preprocessed can have a huge effect on your scikit-learn pipelines. This series of videos highlights common techniques for preprocessing data for modelling.

The basic use of the OneHotEncoder is demonstrated below.

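import numpy as np
from sklearn.preprocessing import OneHotEncoder

# One column of categorical values; the encoder expects a 2D array.
arr = np.array(["low", "low", "high", "medium"]).reshape(-1, 1)
# sparse_output replaces the older sparse argument (scikit-learn 1.2+).
enc = OneHotEncoder(sparse_output=False, handle_unknown='ignore')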
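
To see what the encoder produces, here is a minimal sketch that fits it on the same array and prints the result. It assumes the default sparse output and converts it with .toarray(), which sidesteps the naming of the sparse flag across scikit-learn versions; the variable name encoded is only illustrative.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

arr = np.array(["low", "low", "high", "medium"]).reshape(-1, 1)

# With default settings the encoder returns a SciPy sparse matrix;
# .toarray() turns it into a dense NumPy array for inspection.
enc = OneHotEncoder(handle_unknown="ignore")
encoded = enc.fit_transform(arr).toarray()

# The learned categories are sorted, and each one gets its own column.
print(enc.categories_)
print(encoded)
```

Each row contains a single 1 in the column that matches its category, so the two "low" rows come out identical.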

Because we have set handle_unknown="ignore", the encoder can transform a category it did not see during fitting without causing an error. The unseen value is encoded as a row of zeros.
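
The effect of handle_unknown can be sketched by comparing two encoders on a value that was not in the training data. The value "zero" and the names lenient, strict and raised are assumptions chosen for illustration, not taken from the original example.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

arr = np.array(["low", "low", "high", "medium"]).reshape(-1, 1)

# handle_unknown="ignore": an unseen category becomes an all-zero row.
lenient = OneHotEncoder(handle_unknown="ignore").fit(arr)
unseen = lenient.transform([["zero"]]).toarray()
print(unseen)

# The default, handle_unknown="error", raises a ValueError instead.
strict = OneHotEncoder().fit(arr)
try:
    strict.transform([["zero"]])
    raised = False
except ValueError:
    raised = True
print(raised)
```

Ignoring unknown categories keeps a pipeline running on new data, at the cost of silently mapping every unseen value to the same all-zero encoding.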


