dirty cat: similarity
You can play around with the dirty_cat
settings below.
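import dirty_cat
# Use the 200 most frequent job titles as prototypes for the similarity encoding.
mod = dirty_cat.SimilarityEncoder(categories='most_frequent', n_prototypes=200)
# Each title is compared against every prototype, so expect one output column per prototype.
mod.fit_transform(data[['employee_position_title']]).shape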
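To get a feel for what the settings above control, here is a minimal standard-library sketch of the idea behind similarity encoding: pick the most frequent values as prototypes, then describe every value by its character n-gram similarity to each prototype. This is a simplified stand-in and not dirty_cat's actual implementation. The helper names (`ngrams`, `similarity`, `most_frequent_prototypes`, `encode`), the Jaccard-style similarity, and the sample titles are all assumptions made for illustration.

```python
from collections import Counter


def ngrams(text, n=3):
    """Return the set of character n-grams of a string, padded with spaces."""
    padded = f" {text.lower()} "
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def similarity(a, b, n=3):
    """Jaccard similarity between the n-gram sets of two strings (assumed metric)."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    if not ga and not gb:
        return 1.0
    return len(ga & gb) / len(ga | gb)


def most_frequent_prototypes(values, n_prototypes):
    """Pick the n_prototypes most common values, most frequent first."""
    return [value for value, _ in Counter(values).most_common(n_prototypes)]


def encode(values, prototypes):
    """Encode each value as a row of similarities, one column per prototype."""
    return [[similarity(value, proto) for proto in prototypes] for value in values]


# Hypothetical job titles, including one misspelling.
titles = [
    "Police Officer", "Police Officer", "Police Officer",
    "Bus Operator", "Bus Operator",
    "Police Oficer",
]

prototypes = most_frequent_prototypes(titles, n_prototypes=2)
encoded = encode(titles, prototypes)

print(prototypes)
print(len(encoded), len(encoded[0]))  # rows, columns
```

In this sketch the misspelled title still lands close to its intended prototype instead of becoming an unrelated category, which is the behaviour that makes this kind of encoding useful for messy text columns. Changing `n_prototypes` changes the number of output columns.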