
[Comparison] PHP: detecting duplicate lines in a submitted paragraph. Which approach is better? (Solution)

WBOY (original)
2016-06-13 12:55:46, 837 views

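I wrote two functions to check whether a submitted block of text contains repeated lines, and ran into some problems:
(1) in_array() sometimes fails on Chinese text: a line that clearly exists is reported as missing, and this happens more often with long text.
(2) Sometimes a short paragraph repeated 3 to 4 times is acceptable, but comparing with similar_text rejects the submission as soon as there is a single repeat. How can this be improved?
(3) Is there a better approach altogether? Suggestions welcome.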
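One possible cause of problem (1), offered as an assumption because the post does not show a failing input: text submitted from a browser textarea usually uses "\r\n" line endings, so explode("\n", ...) leaves a trailing "\r" on every line except the last, and in_array() compares the bytes exactly. The sketch below reproduces that mismatch with a made-up sample and shows that splitting on all three line-break forms avoids it.

```php
<?php
// Hypothetical sample: the same Chinese line twice, separated by CRLF.
$text = "重复的一行\r\n重复的一行";

// Splitting on "\n" only leaves "\r" attached to the first line.
$naive = explode("\n", $text);
$naiveFound = in_array($naive[1], array($naive[0]), true);

// Splitting on \r\n, \r or \n leaves no stray "\r" behind.
// These bytes never occur inside a multi-byte UTF-8 character,
// so the pattern is safe without the /u modifier.
$clean = preg_split('/\r\n|\r|\n/', $text);
$cleanFound = in_array($clean[1], array($clean[0]), true);

var_dump($naiveFound); // bool(false)
var_dump($cleanFound); // bool(true)
```

If the failing inputs in practice turn out to use plain "\n", this explanation does not apply and the cause lies elsewhere.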


function hasSimilarText($string)
{
    // Split the text into lines and record each line's byte length.
    $lineArr = explode("\n",$string);
    $arrStr = $arrLen = array();
    foreach($lineArr as $k => $v)
    {
        $arrLen[] = strlen($v);
        $arrStr[] = $v;
    }

    // Compare every line with every other line.
    foreach($arrStr as $k1 => $v1)
    {
        foreach($arrStr as $k2 => $v2)
        {
            if($k1 == $k2) continue;
            // The comparison operator was lost in the page markup; "<" is assumed here.
            if($arrLen[$k2] < 100) continue;
            similar_text($v1, $v2, $pct);
            if($pct > 90) return true;
        }
    }
    return false;
}
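For problem (2), one option is to count near-duplicates per line instead of returning on the first match. The following is a minimal sketch under that assumption, not the author's method: the function name hasTooManySimilarLines and the parameters $maxSimilar and $minPct are hypothetical, and their defaults are illustrative. Note that similar_text() works on bytes, so percentages for multi-byte text are only approximate.

```php
<?php
// Returns true only when some line has MORE than $maxSimilar other lines
// whose similar_text() percentage is above $minPct.
// $maxSimilar = 0 reproduces the "reject on the first repeat" behaviour.
function hasTooManySimilarLines($string, $maxSimilar = 3, $minPct = 90.0)
{
    // Split on any line-break form and drop blank lines.
    $lines = array();
    foreach (preg_split('/\r\n|\r|\n/', $string) as $line) {
        if (trim($line) !== '') {
            $lines[] = $line;
        }
    }

    $n = count($lines);
    $hits = $n > 0 ? array_fill(0, $n, 0) : array();

    // Each unordered pair is compared once; a match counts for both lines.
    for ($i = 0; $i < $n; $i++) {
        for ($j = $i + 1; $j < $n; $j++) {
            similar_text($lines[$i], $lines[$j], $pct);
            if ($pct > $minPct) {
                $hits[$i]++;
                $hits[$j]++;
                if ($hits[$i] > $maxSimilar || $hits[$j] > $maxSimilar) {
                    return true;
                }
            }
        }
    }
    return false;
}
```

With the defaults, a line that appears three times passes and a line that appears five times is rejected. The pairwise loop is still quadratic in the number of lines, so it suits short submissions better than long ones.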
/* Repeated-paragraph detection */
function hasRepeatLine($string)
{
    $string = str_replace(array("\t"," ","@","#","。",",",".",","),'',$string);
    //$string = str_replace("\r","\n",$string);
    $lineArr = explode("\n",$string);
    $countShort = $countMiddle = $countLong = 0;
    $arr = array();

    foreach($lineArr as $lineString)
    {
        $length = strlen( $lineString );
        if($length 
        if(in_array($lineString,$arr))
        {
            if($length 
            {
                $countShort++;
                if($countShort > 4) return true; // 5 times
            } elseif($length>12 && $length 
                $countMiddle++;
                if($countMiddle > 3) return true; // 4 times
            } elseif($length>50 && $length 
                $countLong++;
                if($countLong > 2) return true; // 3 times
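For problem (3), a possible alternative to hasRepeatLine() is to count exact repeats per distinct line with array keys (a hash lookup) rather than in_array(), and to measure length in characters with mb_strlen() instead of bytes. This is a sketch under assumptions: the function name hasTooManyRepeatedLines and the tier limits are illustrative (the post's own thresholds are truncated), the input is assumed to be UTF-8, and the mbstring extension is assumed to be available.

```php
<?php
function hasTooManyRepeatedLines($string)
{
    // Illustrative tiers: lines up to 'maxLen' characters may occur
    // at most 'maxCount' times. These values are assumptions.
    $tiers = array(
        array('maxLen' => 12,          'maxCount' => 5),
        array('maxLen' => 50,          'maxCount' => 4),
        array('maxLen' => PHP_INT_MAX, 'maxCount' => 3),
    );

    // Count occurrences of each distinct, normalized line.
    $counts = array();
    foreach (preg_split('/\r\n|\r|\n/', $string) as $line) {
        $line = str_replace(array("\t", " "), '', $line); // ignore tabs and spaces
        if ($line === '') {
            continue;
        }
        $counts[$line] = isset($counts[$line]) ? $counts[$line] + 1 : 1;
    }

    foreach ($counts as $line => $count) {
        // Numeric-looking keys come back as integers, hence the cast.
        $length = mb_strlen((string) $line, 'UTF-8');
        foreach ($tiers as $tier) {
            if ($length <= $tier['maxLen']) {
                if ($count > $tier['maxCount']) {
                    return true;
                }
                break; // only the first matching tier applies
            }
        }
    }
    return false;
}
```

Counting per distinct line means that several different short lines, each repeated once, do not add up to a rejection, which differs from the shared counters in the original function.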