Simple Implementation of PHP 'Related Article Recommendation' Function
When building a content website, each article page usually needs a list of related articles. The most common approach is probably to build a keyword list, determine which keywords each article contains, and then use those keywords to find the articles most relevant to a given article. For websites with more complex content, maintaining the keyword list is obviously more troublesome.
Later I looked through some PHP functions and found that similar_text (PHP 4, PHP 5) could meet my requirements very conveniently. The idea is: take all article titles from the article list, use similar_text to compare each one with the current article's title, store the comparison results in an array, and then reorder the titles by similarity to get a list of articles similar to the original one.
The key function used in this approach is:
int similar_text ( string $first , string $second [, float &$percent ] )
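It returns the number of matching bytes in the two strings. If the optional third argument is supplied, it is passed by reference and receives the similarity as a percentage.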
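To make the return value concrete, here is a minimal sketch of calling similar_text with and without the third argument. The sample strings are my own, and the byte counts for the Chinese text assume the source file is saved as UTF-8.

```php
<?php
// Raw similarity: number of matching bytes between the two strings.
$common = similar_text("World", "Word");

// With the optional third argument, the similarity percentage is written
// into $percent, which is passed by reference.
similar_text("World", "Word", $percent);

echo $common, "\n";            // 4
echo round($percent, 2), "\n"; // 88.89

// similar_text works on bytes, not characters. Assuming UTF-8, each of these
// Chinese characters takes 3 bytes, so two identical 4-character strings
// score 12 rather than 4.
$title = "现代魔法";
echo strlen($title), "\n";                // 12
echo similar_text($title, $title), "\n"; // 12
```

Because the score counts bytes, titles in multi-byte encodings produce larger numbers than their character counts suggest. The relative ordering is what matters for ranking.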
Following this idea, we create the function below. It reorders the $arr_title array by similarity to $title, most similar first.
<?php
$demo_title = "帮客之家";
$demo_arr_title = array("简单易懂的现代魔法", "简单明了的现代魔法", "简明扼要的古代魔法", "不简单的现代魔法", "很难懂的现代魔法");

$new_array = getSimilar($demo_title, $demo_arr_title);
//print_r($new_array);

// Heading text: "The top three articles most related to [$demo_title] are:"
echo "与[$demo_title]最相关的前三个文章是:<br/>";
for ($j = 0; $j <= 2; $j++) {
    echo ($j + 1) . ":" . $new_array[$j] . "<br/>";
}

// $title is the current title; $arr_title is the array of titles to search
function getSimilar($title, $arr_title)
{
    $arr_similar = array();
    $arr_len = count($arr_title);
    for ($i = 0; $i <= ($arr_len - 1); $i++) {
        // Number of bytes the two strings have in common
        $arr_similar[$i] = similar_text($arr_title[$i], $title);
    }
    arsort($arr_similar); // Sort by matching byte count, highest first (keys preserved)

    // Rebuild the title list in the sorted order
    $new_title_array = array();
    foreach ($arr_similar as $old_index => $similar) {
        $new_title_array[] = $arr_title[$old_index];
    }
    return $new_title_array;
}
?>
Program execution result (the first line reads "The top three articles most related to [帮客之家] are:"):
与[帮客之家]最相关的前三个文章是:
1:简单明了的现代魔法
2:简单易懂的现代魔法
3:简明扼要的古代魔法
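The same idea can also be sketched with the percentage that similar_text reports through its third argument, which avoids favoring longer titles simply because they contain more bytes. This is only an illustrative variant, not the function used above: the name getTopSimilar, the $limit parameter, and the sample titles are my own assumptions, and the type declarations assume PHP 7.1 or later.

```php
<?php
// Hypothetical helper (not from the original article): rank titles by the
// percentage reported by similar_text() and keep only the top $limit entries.
function getTopSimilar(string $title, array $titles, int $limit = 3): array
{
    $scores = [];
    foreach ($titles as $index => $candidate) {
        similar_text($candidate, $title, $percent); // $percent is set by reference
        $scores[$index] = $percent;
    }

    arsort($scores); // highest percentage first, original keys preserved

    // Map the best-scoring keys back to their titles.
    $ranked = [];
    foreach (array_slice($scores, 0, $limit, true) as $index => $percent) {
        $ranked[] = $titles[$index];
    }
    return $ranked;
}

$top = getTopSimilar("PHP string functions", [
    "Cooking with garlic",
    "PHP string functions explained",
    "PHP array functions",
], 2);
print_r($top); // "PHP string functions explained", then "PHP array functions"
```

Passing a limit also means the caller no longer needs a fixed loop over the first three entries, which would fail when fewer than three titles are available.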
Some things to note:
The speed problems with similar_text seem to arise only for long sections of text (>20000 chars).
I found a huge performance improvement in my application by just testing if the string to be tested was less than 20000 chars before calling similar_text.
20000+ took 3-5 secs to process, anything else (10000 and below) took a fraction of a second. Fortunately for me, there was only a handful of instances with >20000 chars which I couldn't get a comparison % for.
If you compare the full article text directly instead of just the titles, it may be slower.
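The length check described in these notes could look roughly like the sketch below. The function name similarityOrNull and the $maxLen parameter are hypothetical, and the default of 20000 simply mirrors the figure quoted above; it is not a limit built into PHP. Note that strlen counts bytes, so multi-byte text reaches the threshold sooner.

```php
<?php
// Hypothetical wrapper: skip the comparison entirely when either string is
// longer than $maxLen bytes, and return null to signal "no score".
function similarityOrNull(string $a, string $b, int $maxLen = 20000): ?float
{
    if (strlen($a) > $maxLen || strlen($b) > $maxLen) {
        return null; // too long: the caller decides how to handle it
    }
    similar_text($a, $b, $percent); // $percent is set by reference
    return $percent;
}

var_dump(similarityOrNull("World", "Word"));                 // float, about 88.89
var_dump(similarityOrNull(str_repeat("a", 20001), "Word")); // NULL
```

Returning null rather than 0 keeps "not compared" distinct from "compared and found nothing in common", so skipped articles are not silently ranked last.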