K-Nearest Neighbor (KNN) is a simple, non-parametric, instance-based learning algorithm used for both classification and regression tasks. As a "lazy learner," KNN does not build a model during the training phase; instead, it memorizes the entire training dataset.

When a new, unseen data point needs to be classified or predicted, KNN identifies the 'k' closest data points (neighbors) from the training set based on a distance metric (e.g., Euclidean distance). For classification, the new data point is assigned the class label that is most frequent among its k neighbors. For regression, it is assigned the average or median of the values of its k neighbors.

The choice of 'k' is crucial for performance, and the optimal 'k' often varies with the dataset. While simple and intuitive, KNN can be computationally expensive for large datasets during inference, and it is sensitive to noisy data and irrelevant features.
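The procedure above can be sketched in a few lines of plain Python. This is a minimal illustrative implementation, not an optimized one: the function names `knn_classify` and `knn_regress` are our own, the distance metric is Euclidean, and ties in the majority vote are broken arbitrarily by `Counter`.

```python
import math
from collections import Counter

def euclidean(a, b):
    # Euclidean distance between two feature vectors
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def k_nearest(train_X, query, k):
    # Indices of the k training points closest to the query.
    # This full scan at prediction time is why KNN is a "lazy learner"
    # and why inference is expensive on large datasets.
    order = sorted(range(len(train_X)), key=lambda i: euclidean(train_X[i], query))
    return order[:k]

def knn_classify(train_X, train_y, query, k=3):
    # Classification: majority vote among the k nearest neighbors
    votes = Counter(train_y[i] for i in k_nearest(train_X, query, k))
    return votes.most_common(1)[0][0]

def knn_regress(train_X, train_y, query, k=3):
    # Regression: average of the k nearest neighbors' values
    idx = k_nearest(train_X, query, k)
    return sum(train_y[i] for i in idx) / k
```

For example, with two well-separated clusters of training points, a query near the first cluster inherits its label:

```python
X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
y = ["a", "a", "a", "b", "b", "b"]
knn_classify(X, y, (0.5, 0.5), k=3)  # → "a"
```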