Using BaaL with Label Studio

By: Frédéric Branchaud-Charron (@Dref360)

In this tutorial, we will see how to use BaaL inside of Label Studio, a widely known labelling tool.

By using Bayesian active learning in your labelling setup, you will be able to label only the most informative examples. This will avoid labelling duplicates and easy examples.

This is also a good way to start the conversation between your labelling team and your machine learning team as they need to communicate early in the process!

We will build upon Label Studio’s PyTorch transfer learning example, so be sure to download it and try to run it before adding BaaL to it. The full example can be found here.



Installing BaaL

To install BaaL, add baal to the `pip install` step of the generated Dockerfile:

# Dockerfile
RUN pip install --no-cache \
                -r requirements.txt \
                uwsgi== \
                supervisor==4.2.2 \
                label-studio==1.0.2 \
                baal==1.3.0 \
                click==7.1.2

When developing, you should also install BaaL in your local environment:

pip install baal==1.3.0


The overall changes are pretty minor, so we will go step by step, specifying the class and method we are modifying. Again, the full script is available here.


The simplest way of doing Bayesian uncertainty estimation in active learning is MC-Dropout (Gal and Ghahramani, 2015), which requires Dropout layers. The default ResNet-18 has none, so we use VGG-16 instead.

from baal.bayesian.dropout import patch_module

# ImageClassifier.__init__
self.model = models.vgg16(pretrained=True)
# Replace the last classifier layer so it outputs one score per class.
last_layer_idx = 6
num_ftrs = self.model.classifier[last_layer_idx].in_features
self.model.classifier[last_layer_idx] = nn.Linear(num_ftrs, num_classes)
# Patch the Dropout layers so they stay active at inference time (MC-Dropout).
self.model = patch_module(self.model)
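
To build intuition for why Dropout must stay active at prediction time, here is a minimal pure-Python sketch. It is not BaaL's implementation: the `dropout` and `stochastic_forward` helpers, the toy features and the weights are all hypothetical and only illustrate the idea that each forward pass samples a different sub-network.

```python
import random


def dropout(values, p, rng):
    """Inverted dropout: zero each value with probability p, rescale the rest."""
    keep = 1.0 - p
    return [v / keep if rng.random() < keep else 0.0 for v in values]


def stochastic_forward(features, weights, p, rng):
    """A toy one-neuron 'model' whose inputs go through dropout on every call."""
    dropped = dropout(features, p, rng)
    return sum(w * x for w, x in zip(weights, dropped))


rng = random.Random(0)
features = [0.5, 1.0, -0.3, 2.0]   # hypothetical input
weights = [1.0, -0.5, 2.0, 0.25]   # hypothetical trained weights

# With dropout active, repeated passes on the same input disagree.
samples = [stochastic_forward(features, weights, p=0.5, rng=rng) for _ in range(20)]

# With dropout disabled (p=0), every pass returns the same value.
deterministic = stochastic_forward(features, weights, p=0.0, rng=rng)
```

The spread of `samples` is what MC-Dropout uses as a signal of model uncertainty; with a standard model in evaluation mode, Dropout is turned off and that spread disappears.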

Next, we will wrap our model using baal.modelwrapper.ModelWrapper from BaaL, which will simplify the different loops. If you use another framework, feel free to check out our PyTorch Lightning integration and our HuggingFace integration.

from baal.modelwrapper import ModelWrapper

# ImageClassifier.__init__
self.wrapper = ModelWrapper(self.model, self.criterion)

Training loop

We can simplify the training loop by using ModelWrapper.

NOTE: train now receives a Dataset instead of a DataLoader.

# ImageClassifier
def train(self, dataset, num_epochs=5):
    since = time.time()
    self.wrapper.train_on_dataset(dataset, self.optimizer, batch_size=32,
                                  epoch=num_epochs, use_cuda=use_cuda)
    time_elapsed = time.time() - since
    print('Training complete in {:.0f}m {:.0f}s'.format(time_elapsed // 60, time_elapsed % 60))

    return self.model


We can draw multiple predictions from the model’s parameter distribution using MC-Dropout. In this script we will make 20 predictions per example:

# ImageClassifier
def predict(self, image_urls):
    images = torch.stack([get_transformed_image(url) for url in image_urls])
    with torch.no_grad():
        return self.wrapper.predict_on_batch(images, iterations=20, cuda=use_cuda)

In ImageClassifierAPI, we will leverage this set of predictions and BALD (Houlsby et al., 2013) to estimate the model’s uncertainty and to get the “average prediction”, which should be more trustworthy than any single prediction:

from baal.active.heuristics import BALD

# ImageClassifierAPI.predict

logits = self.model.predict(image_urls)
# Average over the MC-Dropout iterations, which are stacked on the last axis.
average_prediction = logits.mean(-1)
predicted_label_indices = np.argmax(average_prediction, axis=1)
# Get the uncertainty from the predictions.
predicted_scores = BALD().get_uncertainties(logits)
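
To make the BALD score less of a black box, here is a minimal pure-Python sketch of the quantity it estimates: the entropy of the average prediction minus the average entropy of the individual predictions. This is an illustration under simplifying assumptions, not BaaL's code. BaaL's `BALD` operates on whole batches of predictions, whereas here each example is a plain list of logit vectors, one per stochastic pass, and the helper names are hypothetical.

```python
import math


def softmax(logits):
    """Turn one logit vector into class probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats, skipping zero-probability classes."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def bald_score(mc_logits):
    """Mutual information between the prediction and the model parameters.

    mc_logits: list of logit vectors, one per MC-Dropout forward pass.
    """
    mc_probs = [softmax(logits) for logits in mc_logits]
    n_passes = len(mc_probs)
    n_classes = len(mc_probs[0])
    mean_probs = [sum(p[c] for p in mc_probs) / n_passes for c in range(n_classes)]
    expected_entropy = sum(entropy(p) for p in mc_probs) / n_passes
    return entropy(mean_probs) - expected_entropy


# Every pass predicts the same class: the model is consistent.
agree = [[4.0, 0.0, 0.0], [4.0, 0.0, 0.0], [4.0, 0.0, 0.0]]
# Every pass is confident about a different class: the passes disagree.
disagree = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]]

score_agree = bald_score(agree)
score_disagree = bald_score(disagree)
```

The score is close to zero when the passes agree and grows when each pass is confident about a different class, which is exactly the kind of example worth sending to an annotator.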

Launching Label Studio

Following the Label Studio tutorial, you can start your ML backend as usual. In the Settings, do not forget to check all the boxes.

Then, to use active learning, order the tasks by Predictions score.
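
Conceptually, ordering by prediction score amounts to labelling the tasks with the highest uncertainty first. The sketch below illustrates that ranking in plain Python; the task ids, the scores and the `rank_by_uncertainty` helper are hypothetical and are not part of Label Studio's API.

```python
def rank_by_uncertainty(scores, k=None):
    """Return task ids sorted from most to least uncertain.

    scores: dict mapping a task id to its uncertainty score.
    k: optional number of tasks to keep.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked if k is None else ranked[:k]


# Hypothetical uncertainty scores returned by the ML backend.
scores = {"task-1": 0.02, "task-2": 0.91, "task-3": 0.40, "task-4": 0.05}

# The two tasks an annotator would see first.
next_to_label = rank_by_uncertainty(scores, k=2)
```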

Labeling in action!

To test this setup, we imported a subset of MIO-TCD into Label Studio. This dataset is similar to real production data and suffers from heavy class imbalance: the class car represents 90% of all images in the dataset.

After randomly labelling 100 images, we start training the model. On a subset of 10k unlabelled images, we get the following most uncertain predictions:




[Images: the most uncertain predictions, including one labelled "Articulated Truck"]



The model has seen enough cars and now asks for examples of new classes, as they are the most informative. If we continue labelling, we will see a similar behavior, where the class car is undersampled and the others are oversampled.

In Atighehchian et al. 2019, we compare BALD to Uniform sampling on this dataset and we get better performance on underrepresented classes. In the image below, we have the F1 for two underrepresented classes:

In conclusion, we can now use Bayesian active learning in Label Studio, which will help make your labelling process more efficient. Please do not hesitate to reach out on our Gitter or on Label Studio’s Slack if you have feedback or questions.