Clustering is an unsupervised machine learning task that involves grouping a set of data points into subsets, or "clusters," such that data points within the same cluster are more similar to each other than to those in other clusters. Unlike supervised learning, clustering does not rely on predefined labels; instead, it discovers inherent structures or patterns in the data based on similarity measures (e.g., distance metrics like Euclidean distance). The primary objective of clustering is to explore data, identify natural groupings, and gain insights into the underlying distribution of the data. Common clustering algorithms include K-Means, Hierarchical Clustering, DBSCAN, and Gaussian Mixture Models. Clustering is widely applied in various fields such as customer segmentation, anomaly detection, document analysis, image segmentation, and bioinformatics, providing a powerful tool for exploratory data analysis and pattern discovery.
Data Selection & Data Viewer
Get data insights and find the perfect selection strategy
Learn MoreSelf-Supervised Pretraining
Leverage self-supervised learning to pretrain models
Learn MoreSmart Data Capturing on Device
Find only the most valuable data directly on device
Learn More