Medical imaging pipelines at the lab are highly heterogeneous: models must support full 3D volumes, non-standard intensity distributions, variable voxel spacing, and task-specific augmentation strategies.
Unlike 2D vision, there is no widely adopted pretrained backbone for 3D CT segmentation that works consistently across datasets.
This creates several technical constraints:
- Existing 3D pretrained models tend to be dataset-specific rather than broadly generalizable.
- Most architectures require full model fine-tuning, which is slow and expensive.
- Public SSL repositories weren’t designed for 3D or medical pipelines, making adaptation difficult.
- Augmentations from natural images (e.g., color jitter) are not meaningful for CT.
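The last point can be made concrete: CT intensities are calibrated Hounsfield units, so meaningful augmentations are geometric (crops, flips) and physics-aware (HU windowing, noise in HU), not color jitter. Below is a minimal NumPy-only sketch of that idea; the function names, window bounds, and noise level are illustrative, not the lab's actual pipeline (which used MONAI).

```python
import numpy as np

def window_hu(volume, low=-1000.0, high=400.0):
    """Clip to a Hounsfield-unit window and rescale to [0, 1].
    CT intensities are physically calibrated, so windowing plays the
    role that color/contrast jitter plays for natural images."""
    vol = np.clip(volume, low, high)
    return (vol - low) / (high - low)

def random_crop_3d(volume, size, rng):
    """Random spatial crop from a (D, H, W) volume."""
    d, h, w = [rng.integers(0, s - c + 1) for s, c in zip(volume.shape, size)]
    return volume[d:d + size[0], h:h + size[1], w:w + size[2]]

def augment_ct(volume, crop=(32, 64, 64), noise_hu=20.0, seed=None):
    """Geometric transforms and HU-scale noise, then windowing."""
    rng = np.random.default_rng(seed)
    vol = random_crop_3d(volume, crop, rng)
    for axis in range(3):          # random flip along each spatial axis
        if rng.random() < 0.5:
            vol = np.flip(vol, axis=axis)
    vol = vol + rng.normal(0.0, noise_hu, vol.shape)  # noise in HU units
    return window_hu(vol)

vol = np.random.default_rng(0).uniform(-1000, 1000, (64, 128, 128))
out = augment_ct(vol, seed=1)
```

MONAI ships equivalent, battle-tested transforms; the sketch only shows why the augmentation vocabulary differs from 2D vision.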
The team wanted to move beyond task-specific tuning and instead train a DINO-based CT foundation model that could serve multiple oncology use cases, ideally requiring only light downstream adaptation.
With five researchers leading SSL efforts inside a 25-person lab, they needed an implementation that was clean, modular, and easy for several PhD students to use consistently.
To keep experimentation consistent, they standardized on MONAI for medical-imaging data handling, PyTorch Lightning for workflow orchestration, Lightly SSL for the DINOv2 implementation, and an internal config system (“sparkwheel”) for experiment management.
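At the core of the DINO recipe this stack implements is a momentum (EMA) teacher: the teacher network is never trained by the optimizer but is instead a slow-moving average of the student. A minimal torch-only sketch of that update, with illustrative names and momentum value (Lightly provides its own utilities for this):

```python
import copy
import torch

def ema_update(student: torch.nn.Module, teacher: torch.nn.Module, m: float) -> None:
    """teacher <- m * teacher + (1 - m) * student, parameter by parameter."""
    with torch.no_grad():
        for ps, pt in zip(student.parameters(), teacher.parameters()):
            pt.mul_(m).add_(ps, alpha=1.0 - m)

# toy backbone standing in for a 3D encoder
student = torch.nn.Linear(4, 2)
teacher = copy.deepcopy(student)
for p in teacher.parameters():
    p.requires_grad_(False)  # the teacher is updated only via EMA

# one EMA step after a (hypothetical) student optimizer step
ema_update(student, teacher, m=0.996)
```

In a Lightning module this update would typically run once per training step, after the optimizer updates the student; the high momentum keeps the teacher's targets stable across batches.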