
... scikit dummy.


You need to benchmark your models properly, and this step is easy to forget. That is why we want to demonstrate the dummy module of scikit-learn. It is an underappreciated part of the library that gives you a simple baseline to compare against before you trust any real model.
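To make concrete what a dummy model does, here is a minimal sketch on made-up data. The arrays `X_toy` and `y_toy` are hypothetical and exist only for illustration; the point is that `DummyClassifier` ignores the features and predicts from the label distribution alone.

```python
import numpy as np
from sklearn.dummy import DummyClassifier

# Hypothetical toy data: the features carry no information, 7 of 10 labels are 0.
X_toy = np.zeros((10, 1))
y_toy = np.array([0] * 7 + [1] * 3)

# 'most_frequent' always predicts the majority class seen during fit.
baseline = DummyClassifier(strategy="most_frequent").fit(X_toy, y_toy)
predictions = baseline.predict(X_toy)

# Accuracy equals the share of the majority class in y_toy.
baseline_accuracy = baseline.score(X_toy, y_toy)
```

Any real model should beat this number; if it does not, it has not learned anything useful from the features.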


Notes

Let's run a dummy model in a grid search so we can compare its scores with our original result.

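import pandas as pd
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score, make_scorer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# X and y are assumed to be the features and labels used for the original model.
pipe = Pipeline([
    ('scale', StandardScaler()),
    ('model', DummyClassifier())
])

# Try each baseline strategy with the same cross-validation setup.
grid = GridSearchCV(estimator=pipe,
                    param_grid={'model__strategy': ['stratified', 'most_frequent', 'uniform']},
                    cv=5,
                    scoring={'acc': make_scorer(accuracy_score)},
                    refit='acc',
                    return_train_score=True)

grid.fit(X, y);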

To see the grid search results, you can run:

pd.DataFrame(grid.cv_results_)[['param_model__strategy', 'mean_test_acc']]
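
The comparison itself can be sketched end to end on synthetic data. This is a rough illustration, not the original setup: the dataset from `make_classification`, the names `X_demo` and `y_demo`, and the choice of `LogisticRegression` as the "real" model are all assumptions made here so the snippet runs on its own.

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Hypothetical imbalanced dataset (roughly 90% / 10%), standing in for X and y.
X_demo, y_demo = make_classification(n_samples=1000, weights=[0.9, 0.1],
                                     random_state=0)


def mean_cv_accuracy(model):
    """Mean accuracy over 5 cross-validation folds."""
    return cross_val_score(model, X_demo, y_demo, cv=5, scoring="accuracy").mean()


# One baseline score per dummy strategy.
baseline_scores = {
    strategy: mean_cv_accuracy(DummyClassifier(strategy=strategy, random_state=0))
    for strategy in ["stratified", "most_frequent", "uniform"]
}

# Score of an actual model (an illustrative choice), to compare with the baselines.
model_score = mean_cv_accuracy(LogisticRegression(max_iter=1000))
```

On imbalanced labels like these, the `most_frequent` baseline already scores close to the majority class share, so a model accuracy near 0.9 would be less impressive than it sounds.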
