
Function to calculate string similarity in PHP

WBOY
2016-07-20 11:03:21

A detailed introduction to the similar_text() and levenshtein() functions, which PHP provides for measuring how similar two strings are.

similar_text — Calculate the similarity of two strings
int similar_text ( string $first , string $second [, float &$percent ] )
$first: required. The first string to compare.
$second: required. The second string to compare.
$percent: optional. A variable, passed by reference, that receives the similarity as a percentage.

The similarity between the two strings is calculated using the algorithm described in Oliver [1993]. Note that this implementation does not use a stack as in Oliver's pseudocode, but makes recursive calls instead, which may or may not speed up the whole process. Also note that the complexity of this algorithm is O(N**3), where N is the length of the longer string.
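The recursive idea can be sketched in plain PHP. This is only an illustration of the approach described above (find the longest common substring, then recurse on what lies to its left and right), not the engine's actual C code; the function name sketch_similar_text is made up for this example.

```php
<?php
// Hypothetical user-land sketch of the recursive approach; not PHP's internal implementation.
function sketch_similar_text(string $a, string $b): int
{
    $lenA = strlen($a);
    $lenB = strlen($b);
    if ($lenA === 0 || $lenB === 0) {
        return 0;
    }

    // Locate the first longest common substring: start positions and length.
    $max = 0;
    $posA = 0;
    $posB = 0;
    for ($i = 0; $i < $lenA; $i++) {
        for ($j = 0; $j < $lenB; $j++) {
            $k = 0;
            while ($i + $k < $lenA && $j + $k < $lenB && $a[$i + $k] === $b[$j + $k]) {
                $k++;
            }
            if ($k > $max) {
                $max = $k;
                $posA = $i;
                $posB = $j;
            }
        }
    }
    if ($max === 0) {
        return 0;
    }

    // Count the common substring, then recurse on the pieces left and right of it.
    return $max
        + sketch_similar_text(substr($a, 0, $posA), substr($b, 0, $posB))
        + sketch_similar_text(substr($a, $posA + $max), substr($b, $posB + $max));
}

echo sketch_similar_text("abcdefg", "aeg"), "\n"; // 3
```

The three nested loops in the search step are where the cubic worst case comes from. In real code, call the built-in similar_text() instead.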

For example, we want to find the similarity between the string abcdefg and the string aeg:

$first = "abcdefg";
$second = "aeg";

echo similar_text($first, $second);

This outputs 3. To get the similarity as a percentage, pass a variable as the third argument:

$first = "abcdefg";
$second = "aeg";

similar_text($first, $second, $percent);
echo $percent;

This outputs 60.
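To see where the percentage comes from, the short sketch below recomputes it by hand. It assumes the value is the number of matching characters, doubled, divided by the combined length of both strings, and scaled to 100; the variable $byHand is introduced only for this illustration.

```php
<?php
$first = "abcdefg";
$second = "aeg";

// The return value is the number of matching characters; $percent is filled by reference.
$matches = similar_text($first, $second, $percent);

// Recompute the percentage by hand: matches * 2 * 100 / (combined length).
$byHand = $matches * 2 * 100 / (strlen($first) + strlen($second));

printf("%d matching characters, %.1f%% similar (by hand: %.1f%%)\n", $matches, $percent, $byHand);
```

For these two strings that is 3 * 2 * 100 / (7 + 3), which is 60.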

The similar_text() function counts the matching characters in two strings and can also report their similarity as a percentage. The levenshtein() function, introduced next, is faster than similar_text(). However, similar_text() usually gives a more accurate result with fewer modifications needed. Consider levenshtein() when speed matters more than accuracy and the strings are short.
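The two functions measure different things, so their results are not directly comparable. As a rough sketch, the hypothetical helper levenshtein_percent() below turns an edit distance into a percentage by dividing it by the length of the longer string; that normalization is an assumption made for this example, not something PHP provides.

```php
<?php
// Hypothetical helper: express an edit distance as a similarity percentage.
function levenshtein_percent(string $a, string $b): float
{
    $longest = max(strlen($a), strlen($b));
    if ($longest === 0) {
        return 100.0; // two empty strings are identical
    }
    return (1 - levenshtein($a, $b) / $longest) * 100;
}

similar_text("abcdefg", "aeg", $percent);

printf("similar_text: %.1f%%\n", $percent);
printf("levenshtein:  %.1f%%\n", levenshtein_percent("abcdefg", "aeg"));
```

Here the edit distance is 4 (four deletions), so the second figure is lower than the 60% that similar_text() reports for the same pair.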


Instructions for use
First, read the description of the levenshtein() function in the manual:

The levenshtein() function returns the Levenshtein distance between two strings.

Levenshtein distance, also known as edit distance, refers to the minimum number of edit operations required between two strings to convert one into the other. Permitted editing operations include replacing one character with another, inserting a character, and deleting a character.

For example, converting kitten to sitting takes three edits, so the distance is 3:

sitten (k→s)
sittin (e→i)
sitting (→g)

The levenshtein() function gives equal weight to each operation (replacement, insertion, and deletion). However, you can define the cost of each operation by setting the optional insert, replace, and delete parameters.
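The weighted distance can be sketched with the standard dynamic-programming table, kept here as two rows. This is an illustrative version only: the name sketch_levenshtein is hypothetical, and the built-in levenshtein() is what you would call in practice.

```php
<?php
// Hypothetical sketch of a weighted edit distance; use the built-in levenshtein() in real code.
function sketch_levenshtein(string $from, string $to, int $insert = 1, int $replace = 1, int $delete = 1): int
{
    $m = strlen($from);
    $n = strlen($to);

    // Row 0: turning an empty prefix of $from into the first $j bytes of $to takes $j insertions.
    $prev = [];
    for ($j = 0; $j <= $n; $j++) {
        $prev[$j] = $j * $insert;
    }

    for ($i = 1; $i <= $m; $i++) {
        // Column 0: turning the first $i bytes of $from into an empty string takes $i deletions.
        $curr = [$i * $delete];
        for ($j = 1; $j <= $n; $j++) {
            $same = $from[$i - 1] === $to[$j - 1];
            $curr[$j] = min(
                $prev[$j - 1] + ($same ? 0 : $replace), // keep or replace a byte
                $prev[$j] + $delete,                    // delete a byte of $from
                $curr[$j - 1] + $insert                 // insert a byte of $to
            );
        }
        $prev = $curr;
    }

    return $prev[$n];
}

echo sketch_levenshtein("kitten", "sitting"), "\n";                     // 3
echo sketch_levenshtein("Hello World", "ello World", 10, 20, 30), "\n"; // 30
```

The second call costs 30 because the only edit needed is deleting the leading H, and deletion is priced at 30.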

Syntax:

levenshtein(string1,string2,insert,replace,delete)


Parameter Description

•string1 Required. The first string to compare.

•string2 Required. The second string to compare.

•insert Optional. The cost of inserting a character. The default is 1.

•replace Optional. The cost of replacing a character. The default is 1.

•delete Optional. The cost of deleting a character. The default is 1.

Tips and Notes

•In PHP versions before 8.0, the levenshtein() function returns -1 if one of the strings exceeds 255 characters.

•The levenshtein() function is case sensitive: it compares the strings byte by byte.

•The levenshtein() function is faster than the similar_text() function. However, the similar_text() function usually gives a more accurate result with fewer modifications needed.

Example

<?php
    echo levenshtein("Hello World","ello World");
    echo "\n";
    echo levenshtein("Hello World","ello World",10,20,30);
?>

Output:

1
30

In the second call, the only edit is deleting the leading H, which costs 30.
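Because levenshtein() compares bytes, strings that differ only in letter case have a non-zero distance. A minimal sketch of a case-insensitive comparison follows; the wrapper name levenshtein_ci is hypothetical, and lowercasing with strtolower() is only adequate for ASCII text.

```php
<?php
// Hypothetical wrapper: lowercase both inputs before measuring the distance (ASCII only).
function levenshtein_ci(string $a, string $b): int
{
    return levenshtein(strtolower($a), strtolower($b));
}

echo levenshtein("Hello", "hello"), "\n";    // 1: H and h are different bytes
echo levenshtein_ci("Hello", "hello"), "\n"; // 0
```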

