Label Studio
The most flexible data labeling platform to fine-tune LLMs, prepare training data or validate AI models.
How to run it
Label Studio can help you annotate data with the use of AI; in that case, you can use GPUs. For automatic annotation and a streamlined training process, check out the ML backend section.
You will also want to create and mount volumes for your data and for the PostgreSQL database so that your work is preserved.
Create volumes
$ cgc volume create --name label-studio-data -s 5
$ cgc volume create --name label-studio-db -s 1
Create database
$ cgc db create --name label-studio-db -c 1 -m 2 -v label-studio-db postgresql
Run Label Studio
<APP_TOKEN_DB> can be found using the cgc db list -d command.
$ cgc compute create --name label-studio01 -c 8 -m 24 -v label-studio-data label-studio -d postgre_host=label-studio-db -d postgre_password=<APP_TOKEN_DB>
After the app is created, you can log in to the web interface using the information provided in the output.
The URL and app_token can be found using the cgc compute list -d command.
The admin login name is admin@localhost.
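
The steps in this section can be collected into one small script. The sketch below only reuses the commands shown above; the function names, the variables, and the DRY_RUN switch are illustrative and not part of cgc. By default it prints the commands instead of running them, and the database token still has to be copied by hand from the output of cgc db list -d.

```shell
#!/bin/sh
# Sketch only: helper names and DRY_RUN are illustrative, not part of cgc.

DATA_VOLUME="label-studio-data"
DB_VOLUME="label-studio-db"
DB_NAME="label-studio-db"
APP_NAME="label-studio01"

# Print the command when DRY_RUN=1 (the default here), otherwise run it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Create both volumes and the PostgreSQL database.
create_storage() {
    run cgc volume create --name "$DATA_VOLUME" -s 5
    run cgc volume create --name "$DB_VOLUME" -s 1
    run cgc db create --name "$DB_NAME" -c 1 -m 2 -v "$DB_VOLUME" postgresql
}

# Create the Label Studio app.
# $1: the database app token, copied from the output of `cgc db list -d`.
create_app() {
    run cgc compute create --name "$APP_NAME" -c 8 -m 24 -v "$DATA_VOLUME" \
        label-studio -d postgre_host="$DB_NAME" -d postgre_password="$1"
}
```

To use it, call create_storage, look up the token, and then call create_app "<APP_TOKEN_DB>". Set DRY_RUN=0 to execute the commands instead of printing them.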
How to use Label Studio
Usage is simple. If you know what it takes to label your data, exploring the interface should not take more than 10 minutes, even without any documentation. If you need help, please visit the official documentation.
How to use ML Backend in Label Studio
An ML backend is a tool that integrates machine learning models into the data annotation process. It assists by using existing models to pre-annotate data, which accelerates the workflow by providing initial labels for human annotators to review and refine.
How to set up
Connect the model to your project
After you create a project, open the project settings in the top right corner and select Machine Learning.
Click Add Model and complete the following fields:
Field | Description |
---|---|
Title | Enter a name for the model. |
Backend URL | Enter a URL for the model. |
Description (Optional) | Provide a description of the model. |
Use for interactive preannotations | Enable this option to allow the model to assist with the labelling process by providing real-time predictions or suggestions as annotators work on tasks. For more information, see Interactive pre-annotations. |
Allow version auto-update | Enable this option to allow the model to automatically pre-label images with predictions. |
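
Before entering the Backend URL, it can help to check that the backend is reachable from your machine. The sketch below assumes that the backend is built with the label-studio-ml package, which is expected to answer GET /health with a small JSON status, and that curl is installed. The function names and the example URL are placeholders.

```shell
#!/bin/sh
# Build the health-check URL from a backend base URL.
# A single trailing slash on the base URL is tolerated.
health_url() {
    printf '%s/health\n' "${1%/}"
}

# Return 0 if the backend answers the health check within 10 seconds.
# Assumption: the backend exposes GET /health.
check_backend() {
    curl -fsS --max-time 10 "$(health_url "$1")" >/dev/null
}
```

For example, check_backend "http://localhost:9090" && echo "backend reachable" prints the message only when the request succeeds.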
Pre-annotations/predictions
Get predictions from a model
After you connect a model to Label Studio, you can see model predictions in the labeling interface if the model is pre-trained, or right after it finishes training (remember to allow version auto-update).
For a large dataset, the HTTP request to retrieve predictions might be interrupted by a timeout.
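
One way to keep each request short is to ask the ML backend for predictions in small batches. The sketch below assumes that the backend is built on the label-studio-ml package and accepts POST requests at /predict with a JSON body of the form {"tasks": [...]}, and that the input file holds one task object per line. The function names, the file name, and the batch size are illustrative. Some backends may also expect additional fields, such as the labeling configuration, so treat this as a starting point rather than a finished tool.

```shell
#!/bin/sh
# Wrap task JSON objects (one per line on stdin) into a /predict payload.
build_payload() {
    printf '{"tasks":[%s]}\n' "$(paste -sd, -)"
}

# Send the tasks to the backend in batches.
# $1: backend base URL, $2: file with one task object per line, $3: batch size.
predict_in_batches() {
    tmpdir=$(mktemp -d)
    # Split the task file into chunks of at most $3 lines each.
    split -l "$3" "$2" "$tmpdir/batch."
    for f in "$tmpdir"/batch.*; do
        # Assumption: the backend accepts POST /predict with a JSON body.
        build_payload < "$f" |
            curl -fsS --max-time 60 -H 'Content-Type: application/json' \
                -d @- "${1%/}/predict"
        echo
    done
    rm -rf "$tmpdir"
}
```

For example, predict_in_batches "http://localhost:9090" tasks.jsonl 20 sends 20 tasks per request, so a single slow request is less likely to hit the timeout.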
Available models
The following models are supported by the ML backend.
- The Pre-annotation column indicates whether the model can be used for pre-annotation in Label Studio: you can see pre-annotated data when opening the labeling page or after running predictions for a batch of data.
- The Training column indicates whether the model can be used for training in Label Studio: the model state is updated based on the submitted annotations.
MODEL_NAME | Task | Pre-annotation | Training |
---|---|---|---|
YOLO | Object detection. | ✅ | ✅ |
Segment Anything (SAM) | Object segmentation. | ✅ | ❌ ? |
For more information on the ML backend, you can visit the official documentation.