The first trick revolves around using the .predict_proba() method as a proxy for model confidence. These proba values don't actually serve as a proper measure of confidence, but they might be good enough to generate shortlists of items to double-check. For more information on model confidence, we recommend reading this blog post.
Proba
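The snippets in this section assume a pretrained scikit-learn pipeline called pipe, fit on a dataframe df that has a "text" column and a binary "excitement" label (those names are taken from the snippets further down). If you want to follow along, a minimal sketch like the one below would do; the exact pipeline components are just an assumption, any classifier that supports .predict_proba() works.
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# assumed setup: df has a "text" column and a binary "excitement" label
X, y = df['text'], df['excitement']

# any classifier with a .predict_proba() method would work here
pipe = make_pipeline(CountVectorizer(), LogisticRegression())
pipe.fit(X, y)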
To generate proba values, you can take a pretrained pipeline and run:
pipe.predict_proba(X)
# array([[0.81905624, 0.18094376],
# [0.87339587, 0.12660413],
# [0.99887526, 0.00112474],
# ...,
# [0.95765091, 0.04234909],
# [0.89402035, 0.10597965],
# [0.97989268, 0.02010732]])
This gives us a two-dimensional array with two columns (one for each class). Since each row needs to sum to one, we can take a single column to check how certain the model is in its prediction.
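If you want to convince yourself of that, a quick sanity check (assuming the same pipe and X as before) is:
import numpy as np

# every row of the proba array should sum to one
assert np.allclose(pipe.predict_proba(X).sum(axis=1), 1.0)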
# make predictions
probas = pipe.predict_proba(X)[:, 0]
# use the probas to filter the original dataframe; note that
# probas.shape[0] == df.shape[0]
(df
.loc[(probas > 0.45) & (probas < 0.55)]
[['text', 'excitement']]
.head(7))
By running this, you'll find the example "OMG THOSE TINY SHOES! desire to boop snoot intensifies", which is wrongly labelled as not expressing excitement.
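One way to extend this trick, in case you want a longer shortlist, is to rank every row by how far its proba value is from the 0.5 decision boundary and review the most uncertain examples first. A minimal sketch of that idea, reusing the probas array from above:
import numpy as np

# rank rows by how close the model is to the 0.5 decision boundary
(df
  .assign(uncertainty=np.abs(probas - 0.5))
  .sort_values('uncertainty')
  [['text', 'excitement', 'uncertainty']]
  .head(20))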