PHP levenshtein()
The levenshtein() function is a built-in PHP function that computes the Levenshtein distance between two strings. The Levenshtein distance is the minimum number of single-character edits (replacements, insertions, or deletions) needed to transform one string into the other.
By default, PHP gives all three operations (replace, insert, delete) the same cost of 1. Optional parameters let you assign a different cost to each operation. The algorithm behind this function has a time complexity of O(a*b), where a and b are the lengths of the strings str1 and str2 respectively.
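To make the O(a*b) cost concrete, here is a minimal sketch of the classic dynamic-programming approach with unit costs. The function name edit_distance_sketch is hypothetical and is not part of PHP; it only illustrates why every byte of one string is compared against every byte of the other, and it is not presented as PHP's internal implementation.

```php
<?php
// Hypothetical helper (not part of PHP): a plain dynamic-programming
// edit distance with unit costs, written to show the O(a*b) nested loops.
function edit_distance_sketch(string $a, string $b): int
{
    $lenA = strlen($a);
    $lenB = strlen($b);

    // $prev[$j] is the distance between the bytes of $a processed so far
    // (none yet) and the first $j bytes of $b: $j insertions.
    $prev = range(0, $lenB);

    for ($i = 1; $i <= $lenA; $i++) {
        // Turning the first $i bytes of $a into an empty string takes $i deletions.
        $curr = [$i];
        for ($j = 1; $j <= $lenB; $j++) {
            $replace = $prev[$j - 1] + ($a[$i - 1] === $b[$j - 1] ? 0 : 1);
            $delete  = $prev[$j] + 1;
            $insert  = $curr[$j - 1] + 1;
            $curr[$j] = min($replace, $delete, $insert);
        }
        $prev = $curr;
    }

    return $prev[$lenB];
}

// The sketch should agree with the built-in function.
var_dump(edit_distance_sketch('kitten', 'sitting') === levenshtein('kitten', 'sitting')); // bool(true)
```

The two nested loops run a times b iterations in total, which is where the O(a*b) figure comes from. Only two rows of the table are kept at any time, so memory grows with the length of the second string only.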
There are a few things to note about this function:
Here we discuss the syntax and parameters:
Syntax:
levenshtein(str1,str2,insert,replace,delete)
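Parameters:
str1: The first input string (required).
str2: The second input string (required).
insert: The cost of inserting a character (optional).
replace: The cost of replacing a character (optional).
delete: The cost of deleting a character (optional).
The default value for each of the last 3 parameters is 1.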
Return Value: This function returns the Levenshtein distance between the two input strings as an integer. In PHP versions before 8.0, it returns -1 if either input string is longer than 255 characters; PHP 8.0 removed this limit.
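The effect of the optional cost parameters can be sketched with a few calls. The strings below are arbitrary illustrations; the argument order (insert, replace, delete) follows the syntax shown above.

```php
<?php
// Arguments 3 to 5 are the insertion, replacement and deletion costs, in that order.
$from = 'abc';
$to   = 'abcd';

// Default costs: one insertion of 'd'.
echo levenshtein($from, $to), "\n";            // 1

// Insertion now costs 2, so the same single insertion yields 2.
echo levenshtein($from, $to, 2, 1, 1), "\n";   // 2

// The reverse direction needs one deletion, priced at 3 here.
echo levenshtein($to, $from, 1, 1, 3), "\n";   // 3

// An expensive replacement is avoided: deleting 'a' and inserting 'b' (1 + 1)
// is cheaper than one replacement priced at 5.
echo levenshtein('a', 'b', 1, 5, 1), "\n";     // 2
```

Note that with unequal costs the result depends on the direction of the transformation, so swapping the two strings can change the distance.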
Let us take a few examples to understand how the levenshtein() function works.
Code:
<?php
// PHP code to determine the Levenshtein distance
// between 2 strings $s1 and $s2
$s1 = 'rdo';
$s2 = 'rst';
print_r(levenshtein($s1, $s2));
?>
Output:
2
This is a basic example where the 2 input strings $s1 and $s2 each contain one 3-letter word. The levenshtein() function finds the smallest number of single-character edits that turns the first string into the second. Both strings start with "r", but the remaining 2 letters differ. To make the first string the same as the second, we replace "d" with "s" and "o" with "t", which is 2 edits, hence the output 2.
Code:
<?php
// PHP code to determine the Levenshtein distance
// between 2 strings $s1 and $s2
$s1 = 'first string';
$s2 = 'second string';
print_r(levenshtein($s1, $s2));
?>
Output:
6
In this example, we find the Levenshtein distance between the 2 input strings $s1 and $s2. The two strings share the trailing " string", which needs no edits, so the whole distance comes from turning "first" into "second". The cheapest way is to replace the 5 letters "f, i, r, s, t" with "s, e, c, o, n" and then insert the extra "d". That is 5 replacements plus 1 insertion, so the function returns 6.
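The claim that the shared word contributes nothing to the distance can be checked directly. This is a small sketch using only the strings from the example above:

```php
<?php
$s1 = 'first string';
$s2 = 'second string';

// The shared suffix ' string' needs no edits, so the whole distance
// comes from turning 'first' into 'second'.
echo levenshtein('first', 'second'), "\n"; // 6
echo levenshtein($s1, $s2), "\n";          // 6

// The two strings differ in length by one character, which accounts for
// the single insertion; the other 5 edits are replacements.
echo strlen($s2) - strlen($s1), "\n";      // 1
```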
Code:
<?php
// PHP code to determine the Levenshtein distance
// between $s1 and $s2
$s1 = 'Common Three Words';
$s2 = 'Common Words';
echo("The Levenshtein distance is: ");
print_r(levenshtein($s1, $s2));
?>
Output:
The Levenshtein distance is: 6
In this example, the first string has 3 words whereas the second string has only 2 words, and both words of the second string are already present in the first. The only difference is the word "Three", which has 5 characters. Interestingly, the output is 6 rather than 5, which shows that the extra space is also counted as a character.
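Because every space counts as a character, stray whitespace can inflate the distance. The following is a minimal sketch of one way to handle that, assuming you want runs of whitespace treated as a single space; normalize_spaces is a hypothetical helper, not a PHP built-in.

```php
<?php
// Hypothetical helper: collapse runs of whitespace to a single space and trim
// the ends, so stray spaces do not inflate the distance.
function normalize_spaces(string $s): string
{
    return trim(preg_replace('/\s+/', ' ', $s));
}

$a = 'Common   Words ';   // three spaces in the middle, one trailing
$b = 'Common Words';

// Two surplus spaces in the middle plus one trailing space: 3 deletions.
echo levenshtein($a, $b), "\n";                                       // 3

// After normalization the strings are identical.
echo levenshtein(normalize_spaces($a), normalize_spaces($b)), "\n";   // 0
```

Whether to normalize depends on the use case: for spell checking it is usually helpful, while for exact text comparison the raw distance may be what you want.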
Code:
<?php
// Misspelled word given as input
$ip = 'giraffee';

// Sample list of correctly spelled words to compare against
$word_list = array('cat', 'dog', 'cow', 'elephant', 'giraffe',
                   'eagle', 'pigeon', 'parrot', 'rabbit');

// No shortest distance found yet
$short = -1;
$close = '';

// Loop through the list to find the closest word
foreach ($word_list as $word) {
    // Levenshtein distance between the input word and the current word
    $levn = levenshtein($ip, $word);

    // Exact match: record it and stop searching
    if ($levn == 0) {
        $close = $word;
        $short = 0;
        break;
    }

    // First word seen, or a word at least as close as the best so far
    if ($levn <= $short || $short < 0) {
        $close = $word;
        $short = $levn;
    }
}

echo "Input word: $ip\n";
if ($short == 0) {
    echo "The closest/exact match found to the input word is: $close\n";
} else {
    echo "Did you mean to spell: $close?\n";
}
?>
Output:
Input word: giraffee
Did you mean to spell: giraffe?
The above example shows one practical use of the levenshtein() function: helping the user correct a misspelled word by comparing it with a pre-defined array of correctly spelled words.
First, we take an input word that is misspelled (giraffee). We then define an array of correct animal names, which includes the correct spelling of the input word (giraffe). A foreach loop iterates through the array and uses levenshtein() to measure how far each word is from the input, keeping track of the closest word seen so far. The loop breaks early only when an exact match is found; otherwise it runs through the whole list. At the end, we check $short, the shortest distance found. If it is 0, an exact match was found and is printed; otherwise the closest word is offered as a suggestion.
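The same idea can be wrapped in a reusable function. The sketch below is one possible arrangement, not the article's original code: suggest_word and its $maxDistance cutoff are hypothetical additions, the cutoff being an assumption that suggestions too far from the input are not worth showing.

```php
<?php
// Hypothetical helper: returns the dictionary word closest to $input,
// or null when nothing is within $maxDistance edits.
function suggest_word(string $input, array $dictionary, int $maxDistance = 2): ?string
{
    $best = null;
    $bestDistance = PHP_INT_MAX;

    foreach ($dictionary as $word) {
        $distance = levenshtein($input, $word);
        if ($distance < $bestDistance) {
            $best = $word;
            $bestDistance = $distance;
        }
        if ($distance === 0) {
            break; // exact match, nothing can be closer
        }
    }

    return $bestDistance <= $maxDistance ? $best : null;
}

$animals = ['cat', 'dog', 'cow', 'elephant', 'giraffe', 'eagle', 'pigeon', 'parrot', 'rabbit'];

var_dump(suggest_word('giraffee', $animals));  // string(7) "giraffe"
var_dump(suggest_word('xylophone', $animals)); // NULL
```

This version uses a strict less-than comparison, so when two words are equally close the first one in the list wins, whereas the loop in the example above keeps the last one. Either choice is reasonable; it only matters when ties occur.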
In summary, the levenshtein() function returns an integer distance obtained by comparing the two input strings character by character. The first two parameters are the input strings and are mandatory; the last 3 parameters are optional and set the cost of the insert, replace, and delete operations, in that order.