
Configure and Use Unitlab CLI and Python SDK: Efficient Dataset Management for AI and ML Projects

Manage your projects and datasets efficiently with Python!

COCO Image Segmentation | Unitlab Annotate

The Unitlab CLI and Python SDK are versatile tools designed to streamline image annotation, image labeling, and data annotation workflows by enabling programmatic interaction with Unitlab Annotate.

The Unitlab CLI is a powerful, Python-based tool designed to manage your projects and datasets directly from your command line. This tool is ideal for developers and data scientists working on computer vision projects who prefer an efficient, scriptable interface over traditional GUI workflows.

While the Unitlab CLI is a command-line utility suited to one-off commands, the SDK lets you manage AI datasets, projects, and annotations through Python code, providing a robust dataset management solution for ML datasets.

We have also written a post with seven tips for automating your image annotation workflows, one of which is to use the programmatic tools, the Unitlab CLI and SDK.

7 Ways to Accelerate Your Image Labeling Process

In this guide, we'll cover the installation, configuration, and usage of both the Unitlab CLI and Python SDK to simplify image labeling tasks and enhance your data annotation solutions.

💡
Curious? Check out our docs to explore Unitlab Annotate!

Install & Configure

To use the CLI and SDK, you need the unitlab package installed and an API key configured, as covered in the two steps below.

Install

Install the unitlab package from PyPI.

pip install --upgrade unitlab
unitlab --help

If the package is successfully installed, you will see this output:

Unitlab CLI installed successfully | Unitlab Annotate

API Key Configuration

To interact with the image annotation solution programmatically, you need a Unitlab API key. Retrieve your API key from Settings > API Key in your Unitlab dashboard or from your account page.

Get Your API Key | Unitlab Annotate

Configure the API key in your terminal:

unitlab configure --api-key YOUR_API_KEY

This saves the key in ~/.unitlab/credentials for future use. You can override it with the --api-key flag if needed.

Unitlab CLI configured successfully | Unitlab Annotate

Unitlab CLI Commands

Project list

To retrieve a list of your current projects, execute:

unitlab project list

This command shows the essential information about your projects: Project ID, Name, AI Model, Number of Images, Annotation Progress, Review Progress, and Created Date of each project in your workspace.

Project List through Unitlab CLI | Unitlab Annotate

Upload data to your project

Instead of manually clicking and uploading data to your project through the GUI in the web platform, you can programmatically upload many images with just one command in the CLI.

Run this command to upload a folder to your project, where PROJECT_ID is the ID of the project you want to add data to, and DIRECTORY_PATH is the absolute path to your source data.

unitlab project upload PROJECT_ID --directory DIRECTORY_PATH

Unitlab CLI uploaded images successfully | Unitlab Annotate
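
When uploads are one step in a larger script, the same command can be assembled from Python. The sketch below only builds the argument list from the syntax shown above and resolves the directory to an absolute path. The helper name build_upload_command is hypothetical and is not part of the unitlab package.

```python
from pathlib import Path


def build_upload_command(project_id: str, directory: str) -> list[str]:
    """Return the argument list for `unitlab project upload` (hypothetical helper)."""
    # The command expects an absolute path, so resolve relative input first.
    absolute_dir = Path(directory).expanduser().resolve()
    return [
        "unitlab", "project", "upload", project_id,
        "--directory", str(absolute_dir),
    ]


# Pass the list to subprocess.run(cmd, check=True) to actually execute it.
cmd = build_upload_command("YOUR_PROJECT_ID", "./images")
print(cmd)
```

Building the command as a list rather than a single string avoids shell quoting problems when the directory path contains spaces.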

Dataset list

Datasets are the annotated datasets available in Unitlab Annotate. The public ones include both user-generated datasets and official datasets by Unitlab Annotate, and you can use them as a starting point for your projects and training data. To see the list of available public datasets together with your own private ones, run this command:

unitlab dataset list

This command shows the essential information about each dataset:

  • Dataset ID - the unique ID of the dataset
  • Name - the name of the dataset
  • Version - the version of the dataset
  • Annotation Type - the annotation type of the images in the dataset
  • Number of data - the number of images in the dataset
  • Download formats - the download formats of the dataset, such as COCO
  • Public - whether the dataset is public or one of your private datasets
Dataset List through Unitlab CLI | Unitlab Annotate
🎓
Unitlab offers many official and public datasets for free!

Download dataset

You can download the annotations of your datasets in the JSON format to your local computer by using this command:

unitlab dataset download DATASET_ID --download-type annotation --export-type ANNOTATION_TYPE

If you want to download the source images, you can use this command:

unitlab dataset download DATASET_ID --download-type files

Unitlab CLI downloaded dataset annotations successfully | Unitlab Annotate

Additional

These are the most common uses of the Unitlab CLI. You can use other commands as well, as detailed in our docs, to speed up your data annotation workflows.

Unitlab CLI (and SDK) is one component of the many features that Unitlab Annotate offers. Try many advanced image labeling tools and sophisticated features for streamlining your data annotation process.

Unitlab Python SDK

Initialize the SDK

Once configured, initialize the SDK in Python:

from unitlab import UnitlabClient

# You can find your API key at https://app.unitlab.ai/Unitlab/api-keys
api_key = "YOUR_API_KEY_HERE"
client = UnitlabClient(api_key)

Remember to keep your Unitlab API key secret!
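
One common way to keep the key out of source code is to read it from an environment variable. The following is a minimal sketch of that idea: the variable name UNITLAB_API_KEY and the helper load_api_key are assumptions made for illustration, not something the unitlab package defines.

```python
import os


def load_api_key(env=os.environ, var_name="UNITLAB_API_KEY"):
    """Read the API key from an environment variable (the name is an assumption)."""
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set {var_name} before running this script.")
    return key


# Usage, once the variable is set in your shell:
# from unitlab import UnitlabClient
# client = UnitlabClient(load_api_key())
```

With this in place, scripts can be committed to version control without exposing the key itself.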

Key Methods

The Unitlab SDK offers a range of features to help with data annotation services, image labeling, and dataset version control. Below are some frequently used methods:

projects                    Get a list of projects.
project                     Get project information.
project_members             Get project's members.
project_upload_data         Upload data samples to a project.
datasets                    Get a list of available datasets.
dataset_upload              Create a dataset with your own annotations.
dataset_download            Download the dataset's annotation.
dataset_update              Add more data to an existing dataset.
dataset_download_files      Download raw dataset files

These methods are accessible through your initialized object; in our case it is the client variable.

Examples

Below are some common usage examples of the Unitlab Python SDK.

Retrieve Project List

from unitlab import UnitlabClient

api_key = "YOUR_API_KEY_HERE"

client = UnitlabClient(api_key)
client.projects()

This call returns a list of all your projects, including the pk (Project ID), ai_model, name, number_of_data, annotator progress, reviewer progress, creator, and created date:

[
    {
        "pk": "6c8801e5-7ffd-4dd4-81a2-190e1792c154",
        "ai_model": "Person Detection",
        "annotator_progress": 91.10,
        "created": "2024-02-22T11:29:48.723802",
        "creator": "YOUR_EMAIL_ADDRESS",
        "name": "Person Detection",
        "number_of_data": 988,
        "reviewer_progress": 67.81
    },
    {
        "pk": "2ae53739-95db-4a22-b018-846b19f2da05",
        "ai_model": "Vehicle Segmentation",
        "annotator_progress": 98.51,
        "created": "2024-02-15T18:35:48.193787",
        "creator": "YOUR_EMAIL_ADDRESS",
        "name": "Vehicle Segmentation",
        "number_of_data": 1007,
        "reviewer_progress": 89.37
    },
    {
        "pk": "46960ada-3119-4caf-be8e-ca633dbf9a42",
        "ai_model": "Human Pose Detection",
        "annotator_progress": 100,
        "created": "2024-02-15T16:55:32.386444",
        "creator": "YOUR_EMAIL_ADDRESS",
        "name": "Person Polygon Detection",
        "number_of_data": 52,
        "reviewer_progress": 100
    },
    {
        "pk": "245784fb-3b63-49cd-b2d1-6494095c9bb5",
        "ai_model": "Fashion Segmentation",
        "annotator_progress": 88,
        "created": "2024-01-23T17:46:18.134930",
        "creator": "YOUR_EMAIL_ADDRESS",
        "name": "Stuff Segmentation",
        "number_of_data": 2000,
        "reviewer_progress": null
    },
    ...
]
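
Because the result is a plain list of dictionaries, it can be filtered with ordinary Python. As a sketch, assuming the fields shown above, the helper below (a hypothetical name, not an SDK method) lists projects whose review is missing or below a chosen threshold. The sample data is trimmed from the output above; in practice the list would come from client.projects().

```python
def projects_behind_on_review(projects, threshold=90.0):
    """Return names of projects whose reviewer_progress is missing or below threshold."""
    behind = []
    for project in projects:
        progress = project.get("reviewer_progress")
        # null in the JSON output becomes None in Python: treat it as not reviewed.
        if progress is None or progress < threshold:
            behind.append(project["name"])
    return behind


# Trimmed sample in the shape shown above.
sample = [
    {"name": "Person Detection", "reviewer_progress": 67.81},
    {"name": "Vehicle Segmentation", "reviewer_progress": 89.37},
    {"name": "Person Polygon Detection", "reviewer_progress": 100},
    {"name": "Stuff Segmentation", "reviewer_progress": None},
]
print(projects_behind_on_review(sample))
```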

View Project Members

from unitlab import UnitlabClient

api_key = "YOUR_API_KEY_HERE"
project_id = "YOUR_PROJECT_ID"

client = UnitlabClient(api_key)
client.project_members(project_id=project_id)

This code returns a list of a particular project's members with their pk (ID), email, position, progress, and time statistics:

[
    {
        "pk": "e1dd6cb9-1049-4ac1-ba44-41c54e8f7f9c",
        "email": "MEMBER_EMAIL",
        "position": "annotator",
        "progress": 82.13,
        "average_time": 10.45,
        "overall_time": 1692.0
    },
    {
        "pk": "e1dd6cb9-1049-4ac1-ba44-41c54e8f7f9c",
        "email": "MEMBER_EMAIL",
        "position": "annotator",
        "progress": 94.64,
        "average_time": 13.78,
        "overall_time": 2035.0
    },
    {
        "pk": "e1dd6cb9-1049-4ac1-ba44-41c54e8f7f9c",
        "email": "MEMBER_EMAIL",
        "position": "reviewer",
        "progress": 64.72,
        "average_time": 19.5,
        "overall_time": 1013.6
    },
    {
        "pk": "e1dd6cb9-1049-4ac1-ba44-41c54e8f7f9c",
        "email": "MEMBER_EMAIL",
        "position": "reviewer",
        "progress": 72.0,
        "average_time": 18.6,
        "overall_time": 1335.9
    },
    ...
]
✅
Unitlab offers advanced project and member statistics! Learn more.
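
The member list lends itself to quick aggregation. The sketch below, which assumes the position and progress fields shown above, averages progress per position. The function name is hypothetical and the sample values are copied from the output above.

```python
from collections import defaultdict


def average_progress_by_position(members):
    """Map each position (e.g. annotator, reviewer) to its mean progress."""
    grouped = defaultdict(list)
    for member in members:
        grouped[member["position"]].append(member["progress"])
    return {pos: sum(vals) / len(vals) for pos, vals in grouped.items()}


# Sample in the shape returned by client.project_members(...).
sample_members = [
    {"position": "annotator", "progress": 82.13},
    {"position": "annotator", "progress": 94.64},
    {"position": "reviewer", "progress": 64.72},
    {"position": "reviewer", "progress": 72.0},
]
for position, mean in average_progress_by_position(sample_members).items():
    print(f"{position}: {mean:.1f}")
```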

Add Data Samples to a Project

To upload more data to your existing project, provide the project ID (pk) and the path to your directory containing the data samples and run the following code. Use this for seamless data auto-annotation and to keep your projects up to date with new samples.

from unitlab import UnitlabClient

api_key = "YOUR_API_KEY_HERE"
project_id = "YOUR_PROJECT_ID"
directory = "PATH_TO_DATA_SAMPLES"

client = UnitlabClient(api_key)
client.project_upload_data(project_id=project_id, directory=directory)
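
Before uploading, it can help to check what the directory actually contains. The following sketch lists the image files directly inside a folder. The helper list_images and the set of extensions are assumptions for illustration, so adjust them to the file types your project accepts.

```python
import tempfile
from pathlib import Path

# Assumed set of image extensions; adjust to what your project accepts.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def list_images(directory):
    """Return sorted image paths found directly inside `directory`."""
    root = Path(directory)
    if not root.is_dir():
        raise NotADirectoryError(f"{directory} is not a directory")
    return sorted(
        p for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


# Demo on a throwaway folder; point it at your own directory in practice.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.jpg", "b.PNG", "notes.txt"):
        (Path(tmp) / name).touch()
    found = [p.name for p in list_images(tmp)]
print(found)
```

Running a check like this first catches an empty or mistyped path before any upload request is made.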

List Available Datasets

This code snippet will return all public datasets, both official and user-generated, available in Unitlab Annotate.

from unitlab import UnitlabClient

api_key = "YOUR_API_KEY_HERE"

client = UnitlabClient(api_key=api_key)
client.datasets()

This code returns the pk (Dataset ID), name, version, annotation_type, number_of_data, download_formats, and is_public flag of each dataset available in Unitlab Annotate.

[
    {
        "pk": "62ec02b2-98ea-4a28-87b5-cb401e2bc831",
        "name": "CArscr",
        "version": "0.8",
        "annotation_type": "Image Polygon",
        "number_of_data": 39,
        "download_formats": "COCO, YOLOv5, YOLOv8",
        "is_public": True
    },
    {
        "pk": "0e54980c-4d8c-403b-b0bc-ff71908fc323",
        "name": "Person and Face Detection",
        "version": "0.4",
        "annotation_type": "Image Bounding Box",
        "number_of_data": 7,
        "download_formats": "COCO, YOLOv5, YOLOv8",
        "is_public": True
    },
    {
        "pk": "4ae4652d-5fbf-4af3-8106-9a5a025e50ed",
        "name": "Limonmanzana",
        "version": "0.1",
        "annotation_type": "Image Semantic Segmentation",
        "number_of_data": 20,
        "download_formats": "COCO, YOLOv5, YOLOv8",
        "is_public": True
    },
    {
        "pk": "a7d5574d-115a-46f7-a117-be5e944dfd70",
        "name": "Segmentacao_FoFo",
        "version": "0.1",
        "annotation_type": "Image Semantic Segmentation",
        "number_of_data": 58,
        "download_formats": "COCO, YOLOv5, YOLOv8",
        "is_public": True
    },
    ...
]
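
Since download_formats is a comma-separated string in this output, a small helper can pick out the datasets that offer a given format. This is a sketch under that assumption. The function name is hypothetical, and the third sample entry is invented purely to show a non-matching case.

```python
def datasets_with_format(datasets, fmt):
    """Return names of datasets whose download_formats string lists `fmt`."""
    matches = []
    for dataset in datasets:
        # Split the comma-separated string and compare case-insensitively.
        formats = [f.strip().lower() for f in dataset["download_formats"].split(",")]
        if fmt.lower() in formats:
            matches.append(dataset["name"])
    return matches


sample_datasets = [
    {"name": "CArscr", "download_formats": "COCO, YOLOv5, YOLOv8"},
    {"name": "Limonmanzana", "download_formats": "COCO, YOLOv5, YOLOv8"},
    # Hypothetical entry, not from the real listing.
    {"name": "Hypothetical COCO-only set", "download_formats": "COCO"},
]
print(datasets_with_format(sample_datasets, "yolov8"))
```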

Download dataset annotations

You can easily fetch annotated data in JSON format to integrate with your image annotation solution. Retrieve your dataset ID and call the dataset_download method to download the annotated results.

from unitlab import UnitlabClient

api_key = "YOUR_API_KEY_HERE"
dataset_id = "YOUR_DATASET_ID"

client = UnitlabClient(api_key)
client.dataset_download(dataset_id=dataset_id)

Download raw dataset files

To download the original files instead, for example for dataset version control or model training, use the dataset_download_files method:

from unitlab import UnitlabClient

api_key = "YOUR_API_KEY_HERE"
dataset_id = "YOUR_DATASET_ID"

client = UnitlabClient(api_key)
client.dataset_download_files(dataset_id=dataset_id)

This will create a folder named dataset-files-{dataset_id} in your current working directory.
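
A script that continues after the download can locate that folder by rebuilding its name. The sketch below assumes the naming pattern described above; the helper downloaded_files_dir is hypothetical and only constructs and inspects the path.

```python
from pathlib import Path


def downloaded_files_dir(dataset_id, base=None):
    """Return the folder where the downloaded files are described to land."""
    base = Path(base) if base is not None else Path.cwd()
    return base / f"dataset-files-{dataset_id}"


target = downloaded_files_dir("YOUR_DATASET_ID")
# After a download this counts the files; before one, the folder may not exist.
if target.is_dir():
    print(sum(1 for p in target.iterdir() if p.is_file()), "files in", target)
else:
    print("No download found at", target)
```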

Conclusion

The Unitlab CLI and Python SDK provided by Unitlab Annotate offer many functions beyond those highlighted here. You can combine these methods according to your own needs and preferences.

While the Unitlab CLI is ideal for quick tasks and scripts, the Unitlab Python SDK simplifies building an automated pipeline for easier dataset management and data labeling tasks.

For example, common manual chores such as creating project statistics reports, fetching the latest version of a dataset, and uploading additional data for annotation into Unitlab can be easily automated with small scripts.
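
As an illustration of such a script, the sketch below turns a project list into a CSV statistics report. It assumes the field names shown in the project list output earlier. The helper projects_to_csv and the sample row are for illustration only; in practice the list would come from client.projects().

```python
import csv
import io

FIELDS = ["name", "number_of_data", "annotator_progress", "reviewer_progress"]


def projects_to_csv(projects):
    """Render selected project fields as CSV text, ignoring any other keys."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    for project in projects:
        writer.writerow(project)
    return buffer.getvalue()


sample_projects = [
    {
        "pk": "6c8801e5-7ffd-4dd4-81a2-190e1792c154",
        "name": "Person Detection",
        "number_of_data": 988,
        "annotator_progress": 91.1,
        "reviewer_progress": 67.81,
    },
]
report = projects_to_csv(sample_projects)
print(report)
```

Writing the returned text to a file, or running the script on a schedule, gives a recurring progress report without opening the web platform.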