Detecting spam comments in PHP by the ratio of Chinese characters

WBOY
2016-07-13 10:16:31

This article shows how to flag spam comments in PHP by measuring the proportion of Chinese characters in a comment. The implementation is as follows:

1. Requirement:

A type of spam comment has appeared frequently of late: a long run of English text mixed with one or two rare Chinese characters. Because the comment contains Chinese characters but no sensitive Chinese words, it passes the comment filter unchallenged. Such comments can be identified by checking the proportion of Chinese characters, although this check will produce some false positives.

2. Solution:

The solution uses two PHP functions, strlen and mb_strlen. For UTF-8 text, strlen counts bytes, so a single Chinese character has length 3, while mb_strlen counts characters, so the same character has length 1. For a string made up of Chinese and ASCII characters, the difference between the two lengths is therefore twice the number of Chinese characters; dividing it by two gives the Chinese character count. Dividing that count by the length from mb_strlen gives the proportion of Chinese characters in the string.

3. Implementation code:

The code is as follows:
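// strlen() counts bytes; mb_strlen() counts characters (UTF-8 assumed)
$len_all = strlen($comment['text']);
$len_st = mb_strlen($comment['text'], 'UTF-8');
// (bytes - characters) / 2 is the number of Chinese characters; skip empty text to avoid dividing by zero
if ($len_st > 0 && ($len_all - $len_st) / (2 * $len_st) < 0.5) {
    $error = "Less than 50% Chinese characters";
}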
If a comment contains posted code, its proportion of Chinese characters will be low, so strip out the code portion before applying the check.
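
One way to filter out code before the check might look like the sketch below. The article does not say how code is marked up in comments, so the <pre> and <code> tags are an assumption, and the function names strip_code_blocks and is_mostly_chinese are hypothetical. It also assumes UTF-8 input and the mbstring extension.

```php
<?php
// Hypothetical helper: removes <pre>...</pre> and <code>...</code> blocks.
// The tag names are an assumption; adjust the pattern to your comment markup.
function strip_code_blocks(string $text): string
{
    $stripped = preg_replace('#<(pre|code)\b[^>]*>.*?</\1>#isu', '', $text);
    // preg_replace() returns null on error (e.g. invalid UTF-8); keep the input then.
    return $stripped === null ? $text : $stripped;
}

// Hypothetical helper: applies the byte/character ratio check after
// code blocks have been stripped out.
function is_mostly_chinese(string $text, float $threshold = 0.5): bool
{
    $text  = strip_code_blocks($text);
    $chars = mb_strlen($text, 'UTF-8');
    if ($chars === 0) {
        return false; // nothing left to judge; rejecting here is a policy choice
    }
    $chinese = (strlen($text) - $chars) / 2;
    return $chinese / $chars >= $threshold;
}

$comment = '这段代码可以运行<code>for ($i = 0; $i < 10; $i++) { echo $i; }</code>';
var_dump(is_mostly_chinese($comment)); // bool(true)
```

Without the stripping step, the same comment would fall well below the 50% threshold, because the code adds 53 ASCII characters to 8 Chinese ones.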

I hope this article is helpful for your PHP programming.
