A-Z of Machine Learning and Computer Vision Terms

Clustering

Clustering is an unsupervised machine learning task that groups data points into subsets, or "clusters," so that points within the same cluster are more similar to one another than to points in other clusters. Unlike supervised learning, clustering does not rely on predefined labels; instead, it discovers structure in the data using a similarity measure, typically a distance metric such as Euclidean distance. Its main purpose is exploratory: identifying natural groupings and gaining insight into how the data is distributed. Common clustering algorithms include K-Means, Hierarchical Clustering, DBSCAN, and Gaussian Mixture Models. Clustering is applied in customer segmentation, anomaly detection, document analysis, image segmentation, and bioinformatics.
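To make the idea of grouping by distance concrete, here is a minimal sketch of K-Means (Lloyd's algorithm) in plain Python using Euclidean distance. The function names `euclidean`, `farthest_point_init`, and `kmeans`, the sample points, and the deterministic farthest-point initialisation are all assumptions made for illustration; production libraries usually seed centroids differently (for example with randomised k-means++), so treat this as a teaching sketch rather than a reference implementation.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def farthest_point_init(points, k):
    """Pick k starting centroids deterministically (illustrative assumption).

    Start from the first point, then repeatedly add the point that is
    farthest from its nearest already-chosen centroid.
    """
    centroids = [tuple(points[0])]
    while len(centroids) < k:
        next_point = max(
            points,
            key=lambda p: min(euclidean(p, c) for c in centroids),
        )
        centroids.append(tuple(next_point))
    return centroids


def kmeans(points, k, max_iters=100):
    """Cluster points into k groups; returns (labels, centroids)."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iters):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: euclidean(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )
    return labels, centroids


# Two visibly separated groups of 2-D points (made-up sample data).
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
          (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]
labels, centroids = kmeans(points, k=2)
print(labels)
print(centroids)
```

No labels are supplied anywhere: the grouping comes only from the distances between points, which is what distinguishes clustering from supervised learning. Note that K-Means requires choosing the number of clusters `k` up front, whereas algorithms such as DBSCAN infer the number of clusters from the density of the data.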