Application of commonly used distance measurement methods in K nearest neighbor algorithm

王林
2024-01-22 20:54:10

The k nearest neighbor (kNN) algorithm is an instance-based, or memory-based, machine learning algorithm used for classification and recognition. It classifies a query point by finding the training points nearest to it. Because the algorithm stores the training data and consults it directly at prediction time rather than fitting a fixed set of parameters, it is regarded as a non-parametric learning method.

The k nearest neighbor algorithm can be applied to both classification and regression problems. In classification it predicts discrete values (class labels), whereas in regression it predicts continuous values. Before any prediction can be made, a distance must be defined, and there are several common distance measures to choose from.
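
The difference between the two problem types can be sketched as follows. This is a minimal illustration, not a definitive implementation: it assumes the k nearest neighbors have already been found, and it uses majority vote for classification and the arithmetic mean for regression, which are common choices but not the only ones. The function names `predict_class` and `predict_value` are hypothetical.

```python
from collections import Counter
from statistics import mean


def predict_class(neighbor_labels):
    """Classification: return the most frequent label among the neighbors."""
    return Counter(neighbor_labels).most_common(1)[0][0]


def predict_value(neighbor_targets):
    """Regression: return the average target value of the neighbors."""
    return mean(neighbor_targets)


# Labels and targets of three already-selected neighbors (made-up values)
print(predict_class(["spam", "ham", "spam"]))  # spam
print(predict_value([2.0, 3.0, 4.0]))          # 3.0
```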

Euclidean distance

This is a commonly used distance measure, suitable for real-valued vectors. The formula measures the straight-line distance between a query point and another point.

Euclidean distance formula, for two points x and y with n components each:

$$d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$$
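
As a minimal sketch, the Euclidean distance can be computed in plain Python as below. The function name `euclidean_distance` and the choice to represent points as equal-length sequences of numbers are assumptions made for this example.

```python
import math


def euclidean_distance(x, y):
    """Straight-line distance between two equal-length real-valued vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    # Sum the squared coordinate differences, then take the square root
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


print(euclidean_distance([0.0, 0.0], [3.0, 4.0]))  # 5.0
```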

Manhattan distance

This is also a popular distance measure. It sums the absolute differences between the coordinates of two points.

Manhattan distance formula, for two points x and y with n components each:

$$d(x, y) = \sum_{i=1}^{n} |x_i - y_i|$$
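
A minimal sketch of the Manhattan distance in plain Python follows. The function name `manhattan_distance` is an assumption made for this example.

```python
def manhattan_distance(x, y):
    """Sum of absolute coordinate differences between two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(abs(a - b) for a, b in zip(x, y))


# |1 - 4| + |2 - 6| = 3 + 4
print(manhattan_distance([1, 2], [4, 6]))  # 7
```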

Minkowski distance

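This distance measure is a generalized form of the Euclidean and Manhattan distance measures. Its parameter p selects the measure: p = 1 gives the Manhattan distance and p = 2 gives the Euclidean distance.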

Minkowski distance formula, for two points x and y with n components each and a parameter p ≥ 1:

$$d(x, y) = \left( \sum_{i=1}^{n} |x_i - y_i|^p \right)^{1/p}$$
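
The following sketch shows how one function can cover both special cases. The function name `minkowski_distance` and the default of p = 2 are assumptions made for this example.

```python
def minkowski_distance(x, y, p=2):
    """Minkowski distance of order p between two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    if p < 1:
        raise ValueError("p must be at least 1")
    # Raise each absolute difference to the power p, sum, then take the p-th root
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


print(minkowski_distance([1, 2], [4, 6], p=1))  # 7.0, same as Manhattan
print(minkowski_distance([1, 2], [4, 6], p=2))  # 5.0, same as Euclidean
```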

Hamming distance

This technique is often used with Boolean or string vectors. It counts the positions at which two vectors of equal length do not match. For this reason it is also called the overlap metric.

Hamming distance formula, for two vectors x and y with n components each, where each term is 1 if the components differ and 0 otherwise:

$$d(x, y) = \sum_{i=1}^{n} [x_i \neq y_i]$$
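
A minimal sketch of the Hamming distance in plain Python follows. The function name `hamming_distance` is an assumption made for this example; it accepts any two equal-length sequences, including strings and lists of Boolean values.

```python
def hamming_distance(x, y):
    """Number of positions at which two equal-length sequences differ."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    # Each comparison is True (counted as 1) where the elements differ
    return sum(a != b for a, b in zip(x, y))


print(hamming_distance("karolin", "kathrin"))        # 3
print(hamming_distance([1, 0, 1, 1], [1, 1, 0, 1]))  # 2
```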

Why the distance measure matters in the k nearest neighbor algorithm

To determine which data points are closest to a given query point, the distance between the query point and every other data point must be calculated. These distance measures help form the decision boundaries that partition the feature space into regions, so that a query point is classified according to the region it falls in.
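
The procedure described above can be sketched as a small brute-force classifier. This is an illustration under stated assumptions rather than a definitive implementation: the function `knn_classify`, its parameters, the majority-vote rule, and the sample data are all hypothetical, and the distance function is passed in so that any distance measure can be substituted.

```python
import math
from collections import Counter


def euclidean_distance(x, y):
    """Straight-line distance between two equal-length real-valued vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def knn_classify(query, points, labels, k=3, distance=euclidean_distance):
    """Label a query point by majority vote among its k nearest training points."""
    # Rank every stored training point by its distance to the query
    ranked = sorted(zip(points, labels), key=lambda pair: distance(query, pair[0]))
    nearest_labels = [label for _, label in ranked[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Made-up training data: two well-separated groups
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (7.0, 7.5), (6.5, 8.0)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_classify((2.0, 2.0), points, labels, k=3))  # a
print(knn_classify((6.0, 7.0), points, labels, k=3))  # b
```

Sorting all training points costs more than is strictly necessary for large data sets, but it keeps the sketch short and makes the role of the distance function explicit.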

Statement:
This article is reproduced from 163.com. In case of infringement, please contact admin@php.cn to request removal.