Label Studio

The most flexible data labeling platform to fine-tune LLMs, prepare training data or validate AI models.

How to run it

Label Studio can help you annotate data with the help of AI, in which case you can use GPUs. For automatic annotation and a streamlined training process, check out the ML backend section.

You will also want to create and mount volumes for your data and the PostgreSQL database to safeguard your work.

Create volume

$ cgc volume create label-studio-data -s 5
$ cgc volume create label-studio-db -s 1

Create database

$ cgc db create --name label-studio-db -c 1 -m 2 -v label-studio-db postgresql

Run Label Studio

info

<APP_TOKEN_DB> can be found using the cgc db list -d command.

$ cgc compute create --name label-studio01 -c 8 -m 24 -v label-studio-data label-studio -d postgre_host=label-studio-db -d postgre_password=<APP_TOKEN_DB>

After the app is created, you can log in to the web interface using the information provided in the output.
The URL and app_token can be found using the cgc compute list -d command.
The admin login name is admin@localhost.

How to use Label Studio

Usage is simple. If you know what it takes to label your data, exploring the interface should not take more than 10 minutes, even without any documentation. If you need help, please visit the official documentation.

How to use ML Backend in Label Studio

An ML backend is a tool that integrates machine learning models into the data annotation process. It assists by using existing models to pre-annotate data, which accelerates the workflow by providing initial labels for human annotators to review and refine.

Connect the model to your project

After you create a project, open the project settings in the top right corner and select Model.

Click Connect Model and complete the following fields:

| Field | Description |
| --- | --- |
| Title | Enter a name for the model. |
| Backend URL | Enter a URL for the model. |
| Description (Optional) | Provide a description of the model. |
| Use for interactive preannotations | Enable this option to allow the model to assist with the labelling process by providing real-time predictions or suggestions as annotators work on tasks. For more information, see Interactive pre-annotations. |

Pre-annotations/predictions

Get predictions from a model

After you connect a model to Label Studio, you can see model predictions in the labeling interface if the model is pre-trained, or right after it finishes training.

Warning

For a large dataset, the HTTP request to retrieve predictions might be interrupted by a timeout.

Available models

The following models are supported by the ML backend.

  • The Pre-annotation column indicates whether the model can be used for pre-annotation in Label Studio:
    you can see pre-annotated data when opening the labeling page or after running predictions for a batch of data.
  • The Training column indicates whether the model can be used for training in Label Studio: the model state is updated based on the submitted annotations.

| MODEL_NAME | Task | Pre-annotation | Training |
| --- | --- | --- | --- |
| YOLO | Object detection. | | |
| Segment Anything (SAML) | Object segmentation. | | |

Set up custom model in your project

Instructions for loading your custom model into the ML Backend can be found here.

When creating a project in Label Studio, start by selecting the desired labelling template during the Labelling Setup step. For detection models, choose the "Object Detection with Bounding Boxes" template. After selecting the template, add your label names in the Visual mode.
Next, switch to the Code mode and add the model_path attribute to the control tag you're using (e.g., RectangleLabels in this case). This will link your custom model to the automated labelling process.

note

If your model is located directly in the root of the volume (e.g., <volume_name>/<your-model>.pt), you can simply specify the model's filename in the model_path, like so: model_path="<model_name>.pt". However, if the model is stored within a subdirectory of the volume, you will need to provide the full path to the model file (e.g., model_path="<subdirectory_name>/<model_name>.pt").
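The path rule in this note can be sketched in Python. This is only an illustration under stated assumptions: the function name `resolve_model_path` and the mount point `/mnt/label-studio-data` are hypothetical, and this guide does not show how the ML backend actually resolves paths internally.

```python
from pathlib import PurePosixPath


def resolve_model_path(volume_root: str, model_path: str) -> str:
    """Join a model_path value onto the volume root (illustrative only)."""
    return str(PurePosixPath(volume_root) / model_path)


# Hypothetical mount point; the real location depends on your deployment.
VOLUME_ROOT = "/mnt/label-studio-data"

# Model stored directly in the root of the volume: the filename is enough.
root_level = resolve_model_path(VOLUME_ROOT, "my_custom_model.pt")

# Model stored in a subdirectory: the subdirectory is part of model_path.
nested = resolve_model_path(VOLUME_ROOT, "models/my_custom_model.pt")

print(root_level)
print(nested)
```

In both cases the value written into model_path is relative to the volume, never an absolute path on your own machine.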

You can also provide model_score_threshold to filter out predictions based on their confidence scores (default: model_score_threshold="0.5").
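To make the effect of the threshold concrete, here is a minimal Python sketch of confidence-based filtering. The helper `filter_by_score`, the sample detections, and their field names are assumptions made for this example, and whether the real backend treats the threshold as inclusive is not stated in this guide.

```python
def filter_by_score(predictions, threshold=0.5):
    """Keep predictions whose confidence score is at least the threshold.

    The inclusive comparison (>=) is an assumption for this sketch.
    """
    return [p for p in predictions if p["score"] >= threshold]


# Hypothetical detections; the field names are assumptions for this example.
detections = [
    {"label": "Cat", "score": 0.92},
    {"label": "Dog", "score": 0.55},
    {"label": "Dog", "score": 0.31},
]

kept_default = filter_by_score(detections)                # threshold 0.5
kept_strict = filter_by_score(detections, threshold=0.6)  # threshold 0.6

print([p["label"] for p in kept_default])
print([p["label"] for p in kept_strict])
```

A higher threshold yields fewer but more confident suggestions; a lower one yields more suggestions for annotators to review.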

After completing these steps, your configuration should look like this:

<View>
  <Image name="image" value="$image"/>
  <RectangleLabels
    name="label" toName="image"
    model_path="my_custom_model.pt"
    model_score_threshold="0.6">
    <Label value="Cat"/>
    <Label value="Dog"/>
  </RectangleLabels>
</View>
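Before pasting a configuration into the Code mode, it can help to check that the attributes are where you expect them. The following sketch uses only Python's standard library to read them back from the example above; the helper `read_model_settings` is hypothetical and is not part of Label Studio or the ML backend.

```python
import xml.etree.ElementTree as ET

LABEL_CONFIG = """
<View>
  <Image name="image" value="$image"/>
  <RectangleLabels name="label" toName="image"
                   model_path="my_custom_model.pt"
                   model_score_threshold="0.6">
    <Label value="Cat"/>
    <Label value="Dog"/>
  </RectangleLabels>
</View>
"""


def read_model_settings(config_xml: str) -> dict:
    """Extract model attributes and label names from a labeling config."""
    root = ET.fromstring(config_xml.strip())
    control = root.find("RectangleLabels")
    if control is None:
        raise ValueError("no RectangleLabels control tag found")
    return {
        "model_path": control.get("model_path"),
        # Fall back to the documented default of 0.5 when the attribute is absent.
        "model_score_threshold": float(control.get("model_score_threshold", "0.5")),
        "labels": [label.get("value") for label in control.findall("Label")],
    }


settings = read_model_settings(LABEL_CONFIG)
print(settings)
```

If the configuration is not well-formed XML, `ET.fromstring` raises a `ParseError`, which catches unclosed tags before you save the project.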

If you have already created a project and want to connect a custom model, go to the project Settings and select Labelling Interface. There you can configure your model using the steps described above.

note

When labelling an image, make sure to enable the Auto-annotation button located at the bottom of the window to receive automated annotations from the model. Additionally, you can choose to enable Auto-accept Suggestions for a more streamlined workflow.

  • If you enable Auto-accept Suggestions, the annotation regions will appear automatically and be created immediately without further input.
  • If you do not enable Auto-accept Suggestions, the suggested regions will still appear, but you will have the option to manually approve or reject them, either individually or all at once.

Feel free to experiment with these options, and remember that you can adjust them at any time to suit the needs of your project.

For more information on the ML Backend, visit the official documentation.