PHP improves the functions similar_text() and levenshtein() for calculating string similarity, levenshtein

Home

Backend Development

PHP Tutorial

PHP improves the functions similar_text() and levenshtein() for calculating string similarity, levenshtein_PHP tutorial

WBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWB

Jul 13, 2016 am 10:15 AM

PHP improves the functions similar_text() and levenshtein() for calculating string similarity, levenshtein

similar_text() Chinese character version

Copy code The code is as follows:

//Split string
Function split_str($str) {
Preg_match_all("/./u", $str, $arr);
Return $arr[0];
}

//Similarity detection
Function similar_text_cn($str1, $str2) {
        $arr_1 = array_unique(split_str($str1));
        $arr_2 = array_unique(split_str($str2));
$similarity = count($arr_2) - count(array_diff($arr_2, $arr_1));

Return $similarity;
}

levenshtein() Chinese character version

Copy code The code is as follows:

          //拆分字符串
     function mbStringToArray($string, $encoding = 'UTF-8') {
         $arrayResult = array();
         while ($iLen = mb_strlen($string, $encoding)) {
             array_push($arrayResult, mb_substr($string, 0, 1, $encoding));
             $string = mb_substr($string, 1, $iLen, $encoding);
         }
         return $arrayResult;
     }
     //编辑距离
     function levenshtein_cn($str1, $str2, $costReplace = 1, $encoding = 'UTF-8') {
         $count_same_letter = 0;
         $d = array();
         $mb_len1 = mb_strlen($str1, $encoding);
         $mb_len2 = mb_strlen($str2, $encoding);
         $mb_str1 = mbStringToArray($str1, $encoding);
         $mb_str2 = mbStringToArray($str2, $encoding);
         for ($i1 = 0; $i1              $d[$i1] = array();
             $d[$i1][0] = $i1;
         }
         for ($i2 = 0; $i2              $d[0][$i2] = $i2;
         }
         for ($i1 = 1; $i1              for ($i2 = 1; $i2                  // $cost = ($str1[$i1 - 1] == $str2[$i2 - 1]) ? 0 : 1;
                 if ($mb_str1[$i1 - 1] === $mb_str2[$i2 - 1]) {
                     $cost = 0;
                     $count_same_letter++;
                 } else {
                     $cost = $costReplace; //替换
                 }
                 $d[$i1][$i2] = min($d[$i1 - 1][$i2] + 1, //插入
                 $d[$i1][$i2 - 1] + 1, //删除
                 $d[$i1 - 1][$i2 - 1] + $cost);
             }
         }
         return $d[$mb_len1][$mb_len2];
         //return array('distance' => $d[$mb_len1][$mb_len2], 'count_same_letter' => $count_same_letter);
     }

最长公共子序列LCS()

复制代码代码如下:

                  //最长公共子序列英文版
         function LCS_en($str_1, $str_2) {
           $len_1 = strlen($str_1);
           $len_2 = strlen($str_2);
           $len = $len_1 > $len_2 ? $len_1 : $len_2;
           $dp = array();
           for ($i = 0; $i              $dp[$i] = array();
             $dp[$i][0] = 0;
             $dp[0][$i] = 0;
           }
           for ($i = 1; $i              for ($j = 1; $j                if ($str_1[$i - 1] == $str_2[$j - 1]) {
                 $dp[$i][$j] = $dp[$i - 1][$j - 1] + 1;
               } else {
                 $dp[$i][$j] = $dp[$i - 1][$j] > $dp[$i][$j - 1] ? $dp[$i - 1][$j] : $dp[$i][$j - 1];
               }
             }
           }
           return $dp[$len_1][$len_2];
         }
         //拆分字符串
         function mbStringToArray($string, $encoding = 'UTF-8') {
           $arrayResult = array();
           while ($iLen = mb_strlen($string, $encoding)) {
             array_push($arrayResult, mb_substr($string, 0, 1, $encoding));
             $string = mb_substr($string, 1, $iLen, $encoding);
           }
           return $arrayResult;
         }
         //最长公共子序列中文版
         function LCS_cn($str1, $str2, $encoding = 'UTF-8') {
           $mb_len1 = mb_strlen($str1, $encoding);
           $mb_len2 = mb_strlen($str2, $encoding);
           $mb_str1 = mbStringToArray($str1, $encoding);
           $mb_str2 = mbStringToArray($str2, $encoding);
           $len = $mb_len1 > $mb_len2 ? $mb_len1 : $mb_len2;
           $dp = array();
           for ($i = 0; $i              $dp[$i] = array();
             $dp[$i][0] = 0;
             $dp[0][$i] = 0;
           }
           for ($i = 1; $i for ($j = 1; $j If ($mb_str1[$i - 1] == $mb_str2[$j - 1]) {
$dp[$i][$j] = $dp[$i - 1][$j - 1] + 1;
                                                                                                       $dp[$i][$j] = $dp[$i - 1][$j] > $dp[$i][$j - 1] ? $dp[$i - 1][$j] : $dp[$i][$j - 1];
                                                                                                                                                                                                                                                                                                                                                                        return $dp[$mb_len1][$mb_len2];
           }

(100 points)[php]Write a few string processing functions you are familiar with!

addcslashes addslashes bin2hex chop chr chunk_split convert_cyr_string cyrillic

convert_uudecode convert_uuencode count_chars crc32 crc32 crypt echo explode

fprintf get_html_translation_table hebrev

hebrevc
hex2bin — De codes a hexadecimally encoded binary string
html_entity_decode — Convert all HTML entities to their applicable characters
htmlentities — Convert all applicable characters to HTML entities
htmlspecialchars_decode — Convert special HTML entities back to characters
htmlspecialchars — Convert special characters to HTML entities
implode — Join array elements with a string
join

lcfirst — Make a string's first character lowercase
levenshtein — Calculate Levenshtein distance between two strings
localeconv — Get numeric formatting information
ltrim — Strip whitespace (or other characters) from the beginning of a string
md5_file
metaphone — Calculate the metaphone key of a string
money_format — Formats a number as a currency string
nl_langinfo — Query language and locale information
nl2br

number_format — Format a number with grouped thousands
ord

parse_str

print

printf

quoted_printable_decode — Convert a quoted-printable string to an 8 bit string
quoted_printable_encode — Convert a 8 bit string to a quoted-printable string
quotemeta — Quote meta characters
rtrim
setlocale — Set locale information
sha1_file

sha1

soundex — Calculate the soundex key of a string
sprintf — Return a formatted string
sscanf — Parses input from a string according to a .. ....The rest of the text>>

Can you give an easy-to-understand explanation for the levenshtein function of php? I can’t understand the manual

W3School’s explanation:

levenshtein() function returns the Levenshtein distance between two strings.

Levenshtein distance, also known as edit distance, refers to the minimum number of edit operations required between two strings to convert one into the other. Permitted editing operations include replacing one character with another, inserting a character, and deleting a character.

For example, convert kitten to sitting:

sitten (k→s)
sittin (e→i)
sitting (→g)
levenshtein() function gives each operation (replacement, Insertions and deletions) have the same weight. However, you can define the cost of each operation by setting the optional insert, replace, and delete parameters.

Explanation: "Cost" is the weight. In the poster's example, Hello World → ello World, you need to "delete" "H", that is, the fifth parameter is used, and the corresponding weight is 30, so 30 is returned.

http://www.bkjia.com/PHPjc/901291.html

www.bkjia.com

Statement

The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Explain the concept of a PHP session in simple terms.Apr 26, 2025 am 12:09 AM

PHPsessionstrackuserdataacrossmultiplepagerequestsusingauniqueIDstoredinacookie.Here'showtomanagethemeffectively:1)Startasessionwithsession_start()andstoredatain$_SESSION.2)RegeneratethesessionIDafterloginwithsession_regenerate_id(true)topreventsessi

How do you loop through all the values stored in a PHP session?Apr 26, 2025 am 12:06 AM

In PHP, iterating through session data can be achieved through the following steps: 1. Start the session using session_start(). 2. Iterate through foreach loop through all key-value pairs in the $_SESSION array. 3. When processing complex data structures, use is_array() or is_object() functions and use print_r() to output detailed information. 4. When optimizing traversal, paging can be used to avoid processing large amounts of data at one time. This will help you manage and use PHP session data more efficiently in your actual project.

Explain how to use sessions for user authentication.Apr 26, 2025 am 12:04 AM

The session realizes user authentication through the server-side state management mechanism. 1) Session creation and generation of unique IDs, 2) IDs are passed through cookies, 3) Server stores and accesses session data through IDs, 4) User authentication and status management are realized, improving application security and user experience.

Give an example of how to store a user's name in a PHP session.Apr 26, 2025 am 12:03 AM

Tostoreauser'snameinaPHPsession,startthesessionwithsession_start(),thenassignthenameto$_SESSION['username'].1)Usesession_start()toinitializethesession.2)Assigntheuser'snameto$_SESSION['username'].Thisallowsyoutoaccessthenameacrossmultiplepages,enhanc

What are some common problems that can cause PHP sessions to fail?Apr 25, 2025 am 12:16 AM

Reasons for PHPSession failure include configuration errors, cookie issues, and session expiration. 1. Configuration error: Check and set the correct session.save_path. 2.Cookie problem: Make sure the cookie is set correctly. 3.Session expires: Adjust session.gc_maxlifetime value to extend session time.

How do you debug session-related issues in PHP?Apr 25, 2025 am 12:12 AM

Methods to debug session problems in PHP include: 1. Check whether the session is started correctly; 2. Verify the delivery of the session ID; 3. Check the storage and reading of session data; 4. Check the server configuration. By outputting session ID and data, viewing session file content, etc., you can effectively diagnose and solve session-related problems.

What happens if session_start() is called multiple times?Apr 25, 2025 am 12:06 AM

Multiple calls to session_start() will result in warning messages and possible data overwrites. 1) PHP will issue a warning, prompting that the session has been started. 2) It may cause unexpected overwriting of session data. 3) Use session_status() to check the session status to avoid repeated calls.

How do you configure the session lifetime in PHP?Apr 25, 2025 am 12:05 AM

Configuring the session lifecycle in PHP can be achieved by setting session.gc_maxlifetime and session.cookie_lifetime. 1) session.gc_maxlifetime controls the survival time of server-side session data, 2) session.cookie_lifetime controls the life cycle of client cookies. When set to 0, the cookie expires when the browser is closed.

See all articles

Hot AI Tools

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress images for free

Clothoff.io

AI clothes remover

Video Face Swap

Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Assassin's Creed Shadows: Seashell Riddle Solution

4 weeks agoByDDD

What's New in Windows 11 KB5054979 & How to Fix Update Issues

3 weeks agoByDDD

Where to find the Crane Control Keycard in Atomfall

4 weeks agoByDDD

Roblox: Dead Rails - How To Complete Every Challenge

1 months agoByDDD

How to fix KB5055523 fails to install in Windows 11?

2 weeks agoByDDD

Hot Tools

VSCode Windows 64-bit Download

A free and powerful IDE editor launched by Microsoft

MinGW - Minimalist GNU for Windows

This project is in the process of being migrated to osdn.net/projects/mingw, you can continue to follow us there. MinGW: A native Windows port of the GNU Compiler Collection (GCC), freely distributable import libraries and header files for building native Windows applications; includes extensions to the MSVC runtime to support C99 functionality. All MinGW software can run on 64-bit Windows platforms.