
Label Studio

The most flexible data labeling platform to fine-tune LLMs, prepare training data or validate AI models.

How to run it

Label Studio can help you annotate data with the help of AI, in which case you can use GPUs. For automatic annotation and a streamlined training process, check out the ML backend section.

You should also create and mount volumes for your data and for the PostgreSQL database to keep your work safe.

Create volumes

$ cgc volume create --name label-studio-data -s 5
$ cgc volume create --name label-studio-db -s 1

Create database

$ cgc db create --name label-studio-db -c 1 -m 2 -v label-studio-db postgresql

Run Label Studio

Info

<APP_TOKEN_DB> can be found using the cgc db list -d command.

$ cgc compute create --name label-studio01 -c 8 -m 24 -v label-studio-data label-studio -d postgre_host=label-studio-db -d postgre_password=<APP_TOKEN_DB>

After the app is created, you can log in to the web interface using the information provided in the output.
The URL and app_token can be found using the cgc compute list -d command.
The admin login name is admin@localhost.
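
The provisioning steps above can be collected into one script. The following is a minimal Python sketch that only assembles the cgc commands already shown in this section. The function name build_commands and the dry-run printing are illustrative assumptions, and <APP_TOKEN_DB> stays a placeholder because the token is only known after the database has been created.

```python
import shlex


def build_commands(app_token_db: str = "<APP_TOKEN_DB>") -> list[list[str]]:
    """Return the cgc commands from this section as argument lists, in order."""
    return [
        # Volumes for the Label Studio data and for the PostgreSQL database,
        # with the sizes used in this section.
        ["cgc", "volume", "create", "--name", "label-studio-data", "-s", "5"],
        ["cgc", "volume", "create", "--name", "label-studio-db", "-s", "1"],
        # PostgreSQL database backed by the label-studio-db volume.
        ["cgc", "db", "create", "--name", "label-studio-db",
         "-c", "1", "-m", "2", "-v", "label-studio-db", "postgresql"],
        # Label Studio itself, pointed at the database created above.
        ["cgc", "compute", "create", "--name", "label-studio01",
         "-c", "8", "-m", "24", "-v", "label-studio-data", "label-studio",
         "-d", "postgre_host=label-studio-db",
         "-d", f"postgre_password={app_token_db}"],
    ]


if __name__ == "__main__":
    # Dry run: print each command instead of executing it.
    for command in build_commands():
        print(shlex.join(command))
```

To execute the commands instead of printing them, each argument list could be passed to subprocess.run(command, check=True) once the real database token has been filled in.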

How to use Label Studio

Usage is straightforward. If you know what it takes to label your data, exploring the interface should not take more than 10 minutes, even without documentation. If you need help, please visit the official documentation.

How to use ML Backend in Label Studio

An ML backend is a tool that integrates machine learning models into the data annotation process. It assists by using existing models to pre-annotate data, which accelerates the workflow by providing initial labels for human annotators to review and refine.

How to set up

Connect the model to your project

After you create a project, open the project settings in the top right corner and select Machine Learning.

Click Add Model and complete the following fields:

Field | Description
Title | Enter a name for the model.
Backend URL | Enter a URL for the model.
Description (Optional) | Provide a description of the model.
Use for interactive preannotations | Enable this option to allow the model to assist with the labeling process by providing real-time predictions or suggestions as annotators work on tasks. For more information, see Interactive pre-annotations.
Allow version auto-update | Enable this option to automatically pre-label images with predictions.
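
The same connection can presumably be made without the web interface. The sketch below assumes that Label Studio exposes a POST /api/ml/ endpoint and accepts token authentication through an Authorization: Token header; both assumptions, and the payload field names, should be checked against the API reference for your Label Studio version. The helper name build_add_model_request is hypothetical, and all URLs, IDs, and tokens passed to it are placeholders.

```python
import json
import urllib.request


def build_add_model_request(
    label_studio_url: str,
    api_token: str,
    project_id: int,
    backend_url: str,
    title: str,
    description: str = "",
    interactive: bool = False,
) -> urllib.request.Request:
    """Build (but do not send) a request that registers an ML backend."""
    # Assumed mapping of the form fields above to API payload fields.
    payload = {
        "project": project_id,
        "title": title,                 # Title
        "url": backend_url,             # Backend URL
        "description": description,     # Description (Optional)
        "is_interactive": interactive,  # Use for interactive preannotations
    }
    return urllib.request.Request(
        url=label_studio_url.rstrip("/") + "/api/ml/",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Token {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The request can then be sent with urllib.request.urlopen(request, timeout=30). Building the request separately from sending it makes the payload easy to inspect before anything reaches the server.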

Pre-annotations/predictions

Get predictions from a model

After you connect a model to Label Studio, you can see model predictions in the labeling interface if the model is pre-trained, or right after it finishes training (remember to allow version auto-update).
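
For object detection, a backend typically returns predictions as JSON that Label Studio renders as pre-annotations. The sketch below converts pixel bounding boxes into that shape. It assumes a labeling configuration with a RectangleLabels control named label attached to an Image object named image; those two names, the function name, and the default model_version are illustrative assumptions. Label Studio expresses rectangle coordinates as percentages of the image size, which is why the pixel values are rescaled.

```python
def boxes_to_prediction(boxes, image_width, image_height, model_version="example-v1"):
    """Convert pixel boxes (x, y, w, h, label, score) into one prediction dict."""
    results = []
    scores = []
    for x, y, w, h, label, score in boxes:
        results.append({
            "from_name": "label",   # assumed name of the RectangleLabels control
            "to_name": "image",     # assumed name of the Image object
            "type": "rectanglelabels",
            "original_width": image_width,
            "original_height": image_height,
            "value": {
                # Coordinates are percentages of the image dimensions.
                "x": 100.0 * x / image_width,
                "y": 100.0 * y / image_height,
                "width": 100.0 * w / image_width,
                "height": 100.0 * h / image_height,
                "rectanglelabels": [label],
            },
            "score": score,
        })
        scores.append(score)
    return {
        "model_version": model_version,
        # Overall score: mean of the per-box scores, 0.0 if there are no boxes.
        "score": sum(scores) / len(scores) if scores else 0.0,
        "result": results,
    }
```

For example, a 100 by 40 pixel box at position (50, 20) in a 200 by 100 image becomes x=25, y=20, width=50, height=40 in percentage units.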

Warning

For a large dataset, the HTTP request to retrieve predictions might be interrupted by a timeout.
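
One way to reduce the risk of a timeout is to request predictions in smaller batches rather than for the whole dataset at once. The helper below is a generic sketch of such batching; the name chunk_task_ids and the default batch size of 50 are arbitrary assumptions, and the call that would actually request predictions for each batch is intentionally left out.

```python
from typing import Iterator, Sequence


def chunk_task_ids(task_ids: Sequence[int], batch_size: int = 50) -> Iterator[list[int]]:
    """Yield consecutive slices of task_ids with at most batch_size items each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(task_ids), batch_size):
        yield list(task_ids[start:start + batch_size])
```

Each yielded batch can then be sent in its own request, so that a single slow response affects only that batch and can be retried on its own.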

Available models

The following models are supported by the ML backend.

  • The Pre-annotation column indicates whether the model can be used for pre-annotation in Label Studio:
    you can see pre-annotated data when opening the labeling page or after running predictions for a batch of data.
  • The Training column indicates whether the model can be used for training in Label Studio, that is, whether it updates the model state based on the submitted annotations.
MODEL_NAME | Task | Pre-annotation | Training
YOLO | Object detection.
Segment Anything (SAML) | Object segmentation. | ❌ ?

For more information on the ML backend, visit the official documentation.