substr_count

(PHP 4, PHP 5)

substr_count — 计算字串出现的次数

说明

int substr_count ( string $haystack , string $needle [, int $offset = 0 [, int $length ]] )

substr_count() 返回子字符串needle 在字符串 haystack 中出现的次数。注意 needle 区分大小写。

Note:

该函数不会计算重叠字符串。参见下面的例子。

参数

haystack: 在此字符串中进行搜索。
needle: 要搜索的字符串。
offset: 开始计数的偏移位置。
length: 指定偏移位置之后的最大搜索长度。如果偏移量加上这个长度的和大于 haystack 的总长度，则打印警告信息。

返回值

该函数返回整型。

更新日志

版本	说明
5.1.0	新增 `offset` 和 `length` 参数。

范例

Example #1 substr_count() 范例

  <?php
$text  =  'This is a test' ;
echo  strlen ( $text );  // 14

 echo  substr_count ( $text ,  'is' );  // 2

// 字符串被简化为 's is a test'，因此输出 1
 echo  substr_count ( $text ,  'is' ,  3 );

 // 字符串被简化为 's i'，所以输出 0
 echo  substr_count ( $text ,  'is' ,  3 ,  3 );

 // 因为 5+10 > 14，所以生成警告
 echo  substr_count ( $text ,  'is' ,  5 ,  10 );


 // 输出 1，因为该函数不计算重叠字符串
 $text2  =  'gcdgcdgcd' ;
echo  substr_count ( $text2 ,  'gcdgcd' );
 ?>

参见

count_chars() - 返回字符串所用字符的信息
strpos() - 查找字符串首次出现的位置
substr() - 返回字符串的子串
strstr() - 查找字符串的首次出现

用户评论:

[#1] tweston at bangordailynews dot com [2015-04-15 15:10:52]

To account for the case that jrhodes has pointed out, we can change the line to:

substr_count ( implode( ',', $haystackArray ), $needle );

This way:

array (
0 => "mystringth",
1 => "atislong"
);

Becomes

mystringth,atislong

Which brings the count for $needle = "that" to 0 again.

[#2] php at blink dot at [2014-07-07 13:30:12]

This will handle a string where it is unknown if comma or period are used as thousand or decimal separator. Only exception where this leads to a conflict is when there is only a single comma or period and 3 possible decimals (123.456 or 123,456). An optional parameter is passed to handle this case (assume thousands, assume decimal, decimal when period, decimal when comma). It assumes an input string in any of the formats listed below.

function toFloat($pString, $seperatorOnConflict="f")
{
$decSeperator=".";
$thSeperator="";

$pString=str_replace(" ", $thSeperator, $pString);

$firstPeriod=strpos($pString, ".");
$firstComma=strpos($pString, ",");
if($firstPeriod!==FALSE && $firstComma!==FALSE) {
if($firstPeriod<$firstComma) {
$pString=str_replace(".", $thSeperator, $pString);
$pString=str_replace(",", $decSeperator, $pString);
}
else {
$pString=str_replace(",", $thSeperator, $pString);
}
}
else if($firstPeriod!==FALSE || $firstComma!==FALSE) {
$seperator=$firstPeriod!==FALSE?".":",";
if(substr_count($pString, $seperator)==1) {
$lastPeriodOrComma=strpos($pString, $seperator);
if($lastPeriodOrComma==(strlen($pString)-4) && ($seperatorOnConflict!=$seperator && $seperatorOnConflict!="f")) {
$pString=str_replace($seperator, $thSeperator, $pString);
}
else {
$pString=str_replace($seperator, $decSeperator, $pString);
}
}
else {
$pString=str_replace($seperator, $thSeperator, $pString);
}
}
return(float)$pString;
}

function testFloatParsing() {
    $floatvals = array(
        "22 000",
        "22,000",
        "22.000",
        "123 456",
        "123,456",
        "123.456",
        "22 000,76",
        "22.000,76",
        "22,000.76",
        "22000.76",
        "22000,76",
        "1.022.000,76",
        "1,022,000.76",
        "1,000,000",
        "1.000.000",
        "1022000.76",
        "1022000,76",
        "1022000",
        "0.76",
        "0,76",
        "0.00",
        "0,00",
        "1.00",
        "1,00",
        "-22 000,76",
        "-22.000,76",
        "-22,000.76",
        "-22 000",
        "-22,000",
        "-22.000",
        "-22000.76",
        "-22000,76",
        "-1.022.000,76",
        "-1,022,000.76",
        "-1,000,000",
        "-1.000.000",
        "-1022000.76",
        "-1022000,76",
        "-1022000",
        "-0.76",
        "-0,76",
        "-0.00",
        "-0,00",
        "-1.00",
        "-1,00"
    );

    echo "<table>
        <tr>
            <th>String</th>
            <th>thousands</th>
            <th>fraction</th>
            <th>dec. if period</th>
            <th>dec. if comma</th>
        </tr>";

    foreach ($floatvals as $fval) {
        echo "<tr>";
        echo "<td>" . (string) $fval . "</td>";

        echo "<td>" . (float) toFloat($fval, "") . "</td>";
        echo "<td>" . (float) toFloat($fval, "f") . "</td>";
        echo "<td>" . (float) toFloat($fval, ".") . "</td>";
        echo "<td>" . (float) toFloat($fval, ",") . "</td>";
        echo "</tr>";
    }
    echo "</table>";
}

[#3] qeremy [atta] gmail [dotta] com [2013-09-20 20:45:04]

Unicode example with "case-sensitive" option;

<?php function substr_count_unicode($str, $substr, $caseSensitive = true, $offset = 0, $length = null) { if ($offset) { $str = substr_unicode($str, $offset, $length); } $pattern = $caseSensitive ? '~(?:'. preg_quote($substr) .')~u' : '~(?:'. preg_quote($substr) .')~ui'; preg_match_all($pattern, $str, $matches); return isset($matches[0]) ? count($matches[0]) : 0; } function substr_unicode($str, $start, $length = null) { return join('', array_slice( preg_split('~~u', $str, -1, PREG_SPLIT_NO_EMPTY), $start, $length)); } $s = '?mit y??z??m g?z??m...'; print substr_count_unicode($s, '??'); // 3 print substr_count_unicode($s, '??', false); // 4 print substr_count_unicode($s, '??', false, 10); // 1 print substr_count_unicode($s, '??m'); // 2 print substr_count_unicode($s, '??m', false); // 3 ?>

[#4] chrisstocktonaz at gmail dot com [2009-08-26 16:52:21]

In regards to anyone thinking of using code contributed by zmindster at gmail dot com

Please take careful consideration of possible edge cases with that regex, in example:

$url = 'http://w3.host.tld/path/to/file/..../file.extension';
$url = 'http://w3.host.tld/path/to/file/../file.extension?malicous=....';

This would cause a infinite loop and for example be a possible entry point for a denial of service attack. A correct fix would require additional code, a quick hack would be just adding a additional check, without clarity or performance in mind:

...
$i = 0;
while (substr_count($url, '../') && ++$i < strlen($url))
...

-Chris

[#5] jrhodes at roket-enterprises dot com [2009-06-18 01:37:48]

It was suggested to use

substr_count ( implode( $haystackArray ), $needle );

instead of the function described previously, however this has one flaw.  For example this array:

array (
  0 => "mystringth",
  1 => "atislong"
);

If you are counting "that", the implode version will return 1, but the function previously described will return 0.

[#6] gigi at phpmycoder dot com [2009-01-12 08:17:33]

below was suggested a function for substr_count'ing an array, yet for a simpler procedure, use the following:

<?php substr_count ( implode( $haystackArray ), $needle ); ?>

[#7] info at fat-fish dot co dot il [2007-05-05 16:07:11]

a simple version for an array needle (multiply sub-strings):
<?php function substr_count_array( $haystack, $needle ) { $count = 0; foreach ($needle as $substring) { $count += substr_count( $haystack, $substring); } return $count; } ?>

[#8] flobi at flobi dot com [2006-10-25 19:07:11]

Making this case insensitive is easy for anyone who needs this. Simply convert the haystack and the needle to the same case (upper or lower).

substr_count(strtoupper($haystack), strtoupper($needle))

[#9] XinfoX X at X XkarlX X-X XphilippX X dot X XdeX [2003-12-21 13:27:00]

Yet another reference to the "cgcgcgcgcgcgc" example posted by "chris at pecoraro dot net":

Your request can be fulfilled with the Perl compatible regular expressions and their lookahead and lookbehind features.

The example

$number_of_full_pattern = preg_match_all('/(cgc)/', "cgcgcgcgcgcgcg", $chunks);

works like the substr_count function. The variable $number_of_full_pattern has the value 3, because the default behavior of Perl compatible regular expressions is to consume the characters of the string subject that were matched by the (sub)pattern. That is, the pointer will be moved to the end of the matched substring.
But we can use the lookahead feature that disables the moving of the pointer:

$number_of_full_pattern = preg_match_all('/(cg(?=c))/', "cgcgcgcgcgcgcg", $chunks);

In this case the variable $number_of_full_pattern has the value 6.
Firstly a string "cg" will be matched and the pointer will be moved to the end of this string. Then the regular expression looks ahead whether a 'c' can be matched. Despite of the occurence of the character 'c' the pointer is not moved.