Build data flywheels and active learning loops

Powerful Components for Seamless Integration

LightlyOne Worker

A Docker container running on your GPU that does all the processing.

Python SDK

Integrate with other frameworks and create selection jobs using scripts.

LightlyOne Platform

Powerful API and UI that gives instant access to the selected data.

Example of a Python script to do diversity sampling

from lightly.api import ApiWorkflowClient
from lightly.openapi_generated.swagger_client import DatasetType
from lightly.openapi_generated.swagger_client import DatasourcePurpose

# Create the Lightly client to connect to the Lightly Platform.
client = ApiWorkflowClient(token="MY_LIGHTLY_TOKEN")

# Create a new dataset on the Lightly Platform.
client.create_dataset(
   dataset_name="dataset-name",
   dataset_type=DatasetType.IMAGES,
)

# Connect the dataset to your S3 bucket.
client.set_s3_config(
   resource_path="s3://bucket/input/",
   region="eu-central-1",
   access_key="S3-ACCESS-KEY",
   secret_access_key="S3-SECRET-ACCESS-KEY",
   purpose=DatasourcePurpose.INPUT,
)
client.set_s3_config(
   resource_path="s3://bucket/lightly/",
   region="eu-central-1",
   access_key="S3-ACCESS-KEY",
   secret_access_key="S3-SECRET-ACCESS-KEY",
   purpose=DatasourcePurpose.LIGHTLY,
)

# Schedule a Lightly Worker run.
client.schedule_compute_worker_run(
   worker_config={
       "enable_training": True,
   },
   selection_config={
       "n_samples": 50,
       "strategies": [
           {
               "input": {
                   "type": "EMBEDDINGS"
               },
               "strategy": {
                   "type": "DIVERSITY"
               }
           }
       ]
   }
)
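To give an intuition for what a diversity strategy on embeddings does, here is a minimal sketch of greedy farthest-point sampling in plain Python. This is one common way to pick diverse samples, shown under the assumption that embeddings are plain tuples of floats. It is not necessarily the algorithm the LightlyOne Worker runs, and the function name `farthest_point_sampling` is hypothetical.

```python
import math


def farthest_point_sampling(embeddings, n_samples):
    """Greedily pick indices whose embeddings are far apart from each other.

    Illustrative only: a generic diversity heuristic, not LightlyOne's implementation.
    """
    if not embeddings or n_samples <= 0:
        return []
    n_samples = min(n_samples, len(embeddings))
    selected = [0]  # start from the first sample (an arbitrary choice)
    # Distance from every point to its nearest already-selected point.
    min_dist = [math.dist(e, embeddings[0]) for e in embeddings]
    while len(selected) < n_samples:
        # Only consider points that have not been picked yet.
        candidates = [i for i in range(len(embeddings)) if i not in selected]
        # The next pick is the candidate farthest from everything selected so far.
        next_idx = max(candidates, key=lambda i: min_dist[i])
        selected.append(next_idx)
        for i, e in enumerate(embeddings):
            min_dist[i] = min(min_dist[i], math.dist(e, embeddings[next_idx]))
    return selected


# Two tight clusters plus one outlier: picking 3 samples covers all three regions.
embeddings = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0)]
picked = farthest_point_sampling(embeddings, n_samples=3)
print(picked)
```

Near-duplicate points such as `(0.0, 0.0)` and `(0.1, 0.0)` are not both chosen while more distant points remain, which is the behavior diversity sampling aims for.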

Automatically select data that matters

Embeddings

Selection based on image similarity or diversity

Metadata

Selection based on metadata such as location, weather and more

Predictions

Selection based on predictions and probabilities
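As a rough illustration of how metadata and predictions can drive selection, the sketch below first filters samples by a metadata field and then ranks the remainder by the entropy of their predicted class probabilities, so the most uncertain samples come first. The sample dictionary layout, the `weather` field, and the function names are assumptions made for this example and are not part of the LightlyOne API.

```python
import math


def entropy(probabilities):
    """Shannon entropy (in nats) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probabilities if p > 0)


def select_uncertain(samples, n_samples, weather=None):
    """Keep samples matching the metadata filter, then rank by prediction entropy."""
    candidates = [
        s for s in samples
        if weather is None or s["metadata"].get("weather") == weather
    ]
    # Higher entropy means the model is less certain about the sample.
    ranked = sorted(candidates, key=lambda s: entropy(s["probabilities"]), reverse=True)
    return [s["filename"] for s in ranked[:n_samples]]


# Hypothetical samples: file name, metadata, and model class probabilities.
samples = [
    {"filename": "a.jpg", "metadata": {"weather": "rain"}, "probabilities": [0.98, 0.01, 0.01]},
    {"filename": "b.jpg", "metadata": {"weather": "rain"}, "probabilities": [0.40, 0.35, 0.25]},
    {"filename": "c.jpg", "metadata": {"weather": "sun"}, "probabilities": [0.34, 0.33, 0.33]},
    {"filename": "d.jpg", "metadata": {"weather": "rain"}, "probabilities": [0.70, 0.20, 0.10]},
]
selected = select_uncertain(samples, n_samples=2, weather="rain")
print(selected)
```

Here the confident prediction for `a.jpg` is skipped in favor of the two rainy images the model is least sure about.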

Why should I use LightlyOne?

Save costs on building and scaling in-house solutions. Leverage a suite of powerful and evaluated selection algorithms. Reduce deployment cycles and data labeling costs by finding the most relevant training data.

Feedback cycle

Create a feedback cycle from your production data to improve your training data. Use LightlyOne to select data to cover new use cases and to prevent data drift for existing ones.

Easy to use

All you need to use LightlyOne is your data, stored locally or in cloud storage, and a machine, optionally with a GPU.
Our Python SDK allows for easy integration into your existing ML stack within a few hours.

Automate your data pipeline

LightlyOne automates your data pipeline, processing tens of millions of samples daily without manual intervention. Simplify your data curation and selection process while ensuring high-quality training data for your models.

Explore Our Unmatched Features

Discover the unique features that will lift your machine learning pipeline to the next level.

Supported Data types

Images

Sequences

Videos

Selection (Active Learning)

Embeddings

Metadata

Predictions

Features

Integrations

Dashboard

Automation

and more ...

Data Privacy

Data such as images is streamed directly from your connected cloud storage to the LightlyOne Worker or the LightlyOne Platform. Since both components run on the client side, your data never leaves your environment.

Additional Security

We provide additional SSO/2FA and SLAs for our enterprise customers. Contact us to learn more about how LightlyOne complies with SOC2, HIPAA, and GDPR.

Lightly is fully ISO 27001 certified

Improve your data

Today is the day to get the most out of your data. Share our mission with the world: unleash your data's true potential.