similar_text算相似性时归一化时的疑义 -PHP Tutorial-php.cn

Home

Backend Development

PHP Tutorial

similar_text算相似性时归一化时的疑义

WBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWB

Jun 13, 2016 pm 01:14 PM

echonbspquottext

similar_text算相似性时归一化时的疑问
我在算两个字符串的长度时，发现归一化时好像此函数采取的方式不一样。
第一次，我试了两个不一样长的字符串，算其编辑距离：
echo "levenshtein计算：\n";echo levenshtein("seller_id","selr_id");echo "\n";
得到的结果是：2

再用同样的两个字符串，用PHP的similar_text函数来求其相似性
echo "similar_text计算：\n";similar_text("seller_id","selr_id",$percent);
echo $percent;
出现在相似性是：87.5
把2这个距离归一化时，正好符合公式：1-（编辑距离/(两个字符串的长度之和)）

第二次，我试了两个一样长度的字符串，分别算其编辑距离和相似性
similar_text("abcd","1234",$percent);echo $percent;echo "\n";
echo levenshtein("abcd","1234");
得到的值分别为：4和0
正好符合公式：1-（编辑距离/(任一个字符串的长度)）

我的问题是：为什么对两个不一样长的字符串求相似性时，分母是两个字符串的长度之和呢？
我在网上找了些pdf文档看，对编辑距离归一化时，其分母是最长的那个字符串的长度呢。

------解决方案--------------------
应该说 similar_text 函数的设计者，考虑的还是蛮周到的
当传入的两个串长度相同时，计算的相似度与理论上并无差异
当传入的两个串长度不同时，得到的相似度不像理论上的那么陡峭。也就是说被匹配的概率变大
当然如果你不希望这样的话可以自行计算，串都是你的，他也返回了已匹配的数量。计算一下并不困难

Statement

The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

What is the difference between unset() and session_destroy()?May 04, 2025 am 12:19 AM

Thedifferencebetweenunset()andsession_destroy()isthatunset()clearsspecificsessionvariableswhilekeepingthesessionactive,whereassession_destroy()terminatestheentiresession.1)Useunset()toremovespecificsessionvariableswithoutaffectingthesession'soveralls

What is sticky sessions (session affinity) in the context of load balancing?May 04, 2025 am 12:16 AM

Stickysessionsensureuserrequestsareroutedtothesameserverforsessiondataconsistency.1)SessionIdentificationassignsuserstoserversusingcookiesorURLmodifications.2)ConsistentRoutingdirectssubsequentrequeststothesameserver.3)LoadBalancingdistributesnewuser

What are the different session save handlers available in PHP?May 04, 2025 am 12:14 AM

PHPoffersvarioussessionsavehandlers:1)Files:Default,simplebutmaybottleneckonhigh-trafficsites.2)Memcached:High-performance,idealforspeed-criticalapplications.3)Redis:SimilartoMemcached,withaddedpersistence.4)Databases:Offerscontrol,usefulforintegrati

What is a session in PHP, and why are they used?May 04, 2025 am 12:12 AM

Session in PHP is a mechanism for saving user data on the server side to maintain state between multiple requests. Specifically, 1) the session is started by the session_start() function, and data is stored and read through the $_SESSION super global array; 2) the session data is stored in the server's temporary files by default, but can be optimized through database or memory storage; 3) the session can be used to realize user login status tracking and shopping cart management functions; 4) Pay attention to the secure transmission and performance optimization of the session to ensure the security and efficiency of the application.

Explain the lifecycle of a PHP session.May 04, 2025 am 12:04 AM

PHPsessionsstartwithsession_start(),whichgeneratesauniqueIDandcreatesaserverfile;theypersistacrossrequestsandcanbemanuallyendedwithsession_destroy().1)Sessionsbeginwhensession_start()iscalled,creatingauniqueIDandserverfile.2)Theycontinueasdataisloade

What is the difference between absolute and idle session timeouts?May 03, 2025 am 12:21 AM

Absolute session timeout starts at the time of session creation, while an idle session timeout starts at the time of user's no operation. Absolute session timeout is suitable for scenarios where strict control of the session life cycle is required, such as financial applications; idle session timeout is suitable for applications that want users to keep their session active for a long time, such as social media.

What steps would you take if sessions aren't working on your server?May 03, 2025 am 12:19 AM

The server session failure can be solved through the following steps: 1. Check the server configuration to ensure that the session is set correctly. 2. Verify client cookies, confirm that the browser supports it and send it correctly. 3. Check session storage services, such as Redis, to ensure that they are running normally. 4. Review the application code to ensure the correct session logic. Through these steps, conversation problems can be effectively diagnosed and repaired and user experience can be improved.

What is the significance of the session_start() function?May 03, 2025 am 12:18 AM

session_start()iscrucialinPHPformanagingusersessions.1)Itinitiatesanewsessionifnoneexists,2)resumesanexistingsession,and3)setsasessioncookieforcontinuityacrossrequests,enablingapplicationslikeuserauthenticationandpersonalizedcontent.

See all articles