dirty cat: count vectors
When you're working with scikit-learn you'll often need to deal with categorical data, and the way you encode this type of data really matters. In this series of videos we'll explore the dirty-cat library while we try to deal with categorical data.
If you want to play around with count vectors you can run the code below.
from sklearn.feature_extraction.text import CountVectorizer

cv = CountVectorizer().fit(ml_df['employee_position_title'])
cv.transform(ml_df['employee_position_title']).shape
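If you don't have `ml_df` loaded, here is a minimal sketch of the same idea on a toy list. The job titles below are made up for illustration and are not taken from the dataset used in this series. The shape tells you how many rows went in and how many distinct tokens the vectorizer found.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical job titles, standing in for ml_df['employee_position_title'].
titles = [
    "Police Officer III",
    "Bus Operator",
    "Office Services Coordinator",
    "Police Sergeant",
]

cv = CountVectorizer().fit(titles)
X = cv.transform(titles)  # sparse matrix: one row per title, one column per token

# 4 titles, 9 distinct lowercase tokens
print(X.shape)

# Dense view of the first row: a 1 for each token that appears in the title.
print(X[0].toarray())
```

Note that the default settings lowercase the text and only keep tokens of two or more characters, so the number of columns can be smaller than you might expect.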
You can also inspect the vocabulary.
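One way to do that is via the fitted `vocabulary_` attribute, which maps each token to its column index. The sketch below again uses made-up job titles rather than `ml_df`, so treat the data as an assumption.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical job titles, standing in for ml_df['employee_position_title'].
titles = [
    "Police Officer III",
    "Bus Operator",
    "Office Services Coordinator",
    "Police Sergeant",
]

cv = CountVectorizer().fit(titles)

# Dictionary of token -> column index in the transformed matrix.
vocab = cv.vocabulary_
print(vocab)

# Tokens listed in column order, which is alphabetical.
tokens = sorted(vocab, key=vocab.get)
print(tokens)
```

Looking at the tokens is a quick way to spot near-duplicates such as "office" and "officer", which end up in separate columns.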