
Calmcode Labs Presents

doubtlab

Our 8th experiment involves doubtful annotations.

Bad labels are pretty common. Just have a look at the labelerrors project and the bad labels course on calmcode. After stumbling over label errors ourselves in the past, we figured it was time to host a suite of tools that might help you find bad labels, which led to the creation of doubtlab.

Reasons of Doubt

Doubtlab is a collection of tools that let you define reasons to doubt your labels. For example, when you're dealing with a classification problem, you could assign doubt to a label when:

  • it is part of an example that seems to be an outlier
  • a model doesn't assign a high confidence to any label
  • a model estimates a high confidence for the wrong label
  • a model estimates a low confidence for the given label
  • two models disagree on what prediction to give it

All of these "reasons" are sensible reasons to double-check a label. The library supports all of these, and more, but the real power comes from combining them. After all, an example that triggers three reasons of doubt deserves more attention than one that triggers only a single reason.
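To make the idea of stacking reasons concrete, here is a minimal hand-rolled sketch in plain Python. It is not how doubtlab implements its reasons: the function names, the thresholds (0.6 and 0.3) and the probability values are all assumptions picked for illustration. It only shows that each reason boils down to a 0/1 flag per example and that the flags can be added up.

```python
# Illustrative only: hand-rolled versions of three doubt reasons.
# None of these names or thresholds come from doubtlab itself.

def low_confidence(probas, threshold=0.6):
    """Flag rows where no class gets a probability above the threshold."""
    return [1 if max(row) < threshold else 0 for row in probas]

def wrong_prediction(probas, labels):
    """Flag rows where the most likely class differs from the given label."""
    return [1 if row.index(max(row)) != label else 0
            for row, label in zip(probas, labels)]

def low_label_confidence(probas, labels, threshold=0.3):
    """Flag rows where the given label itself gets a low probability."""
    return [1 if row[label] < threshold else 0
            for row, label in zip(probas, labels)]

# Hypothetical class probabilities for four examples and three classes.
probas = [
    [0.90, 0.05, 0.05],  # confident and agrees with the label
    [0.40, 0.35, 0.25],  # no class stands out
    [0.10, 0.85, 0.05],  # confident, but disagrees with the label
    [0.25, 0.50, 0.25],  # unsure and disagrees with the label
]
labels = [0, 0, 0, 0]

flags = {
    "low_confidence": low_confidence(probas),
    "wrong_prediction": wrong_prediction(probas, labels),
    "low_label_confidence": low_label_confidence(probas, labels),
}

# Count how many reasons fire per example; more reasons means more doubt.
doubt_counts = [sum(column[i] for column in flags.values())
                for i in range(len(labels))]
print(doubt_counts)
```

With these made-up numbers the last example triggers all three reasons, so it would be the first candidate for a second look.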


Doubtlab allows you to declare reasons of doubt which you can then combine in a doubt-ensemble. A demonstration of the syntax is shown below.

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

from doubtlab.ensemble import DoubtEnsemble
from doubtlab.reason import ProbaReason, WrongPredictionReason

# Let's say we have some dataset/model already.
# The iris dataset is used here as a stand-in for your own data.
X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1_000)
model.fit(X, y)

# Next we can add reasons for doubt. In this case we're saying
# that examples deserve another look if the associated proba values
# are low or if the model output doesn't match the associated label.
reasons = {
    'proba': ProbaReason(model=model),
    'wrong_pred': WrongPredictionReason(model=model),
}

# Pass these reasons to a doubtlab instance.
doubt = DoubtEnsemble(**reasons)

# Get the ordered indices of examples worth checking again
indices = doubt.get_indices(X, y)
# Get dataframe with "reason"-ing behind the sorting
predicates = doubt.get_predicates(X, y)

The predicates dataframe is sorted and lists all the reasons of doubt for each example. For more details on how this works internally, be sure to check the Quickstart Guide.
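As a rough mental model of that sorting step, the plain-Python sketch below orders examples by how many reasons they trigger. It rests on assumptions: the flag values are invented, and the ordering rule (most triggered reasons first, untouched examples left out of the indices) is a guess at the behaviour described above, not a copy of the library's internals.

```python
# Hypothetical 0/1 flags per reason, one entry per example.
flags = {
    "proba":      [0, 1, 0, 1, 0],
    "wrong_pred": [0, 0, 1, 1, 0],
}

n_examples = len(flags["proba"])

# One row per example: original index plus a flag per reason.
rows = [
    {"index": i, **{name: column[i] for name, column in flags.items()}}
    for i in range(n_examples)
]

# Add the number of triggered reasons to every row.
for row in rows:
    row["total"] = sum(row[name] for name in flags)

# Most doubtful examples first; ties keep their original order
# because Python's sort is stable.
predicates = sorted(rows, key=lambda row: row["total"], reverse=True)

# Only examples with at least one triggered reason are worth revisiting.
indices = [row["index"] for row in predicates if row["total"] > 0]
print(indices)
```

Here example 3 comes first because both reasons fire for it, while examples 0 and 4 never show up in the indices.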

Learn more

To learn more about the library, be sure to check the GitHub page and the documentation.

If you're looking for a detailed example on how to use doubtlab, make sure you check the Google Emotions example in the docs or this tutorial from Explosion on YouTube. There's also a presentation of the tool over at this open-source spotlight from DataTalksClub.

We hope that you'll give this library a spin and that it helps improve your data quality.
