Home  >  Article  >  Backend Development  >  An interview question about php string interception

An interview question about php string interception

WBOY
WBOYOriginal
2016-09-21 14:13:10934browse

<code>$str = '这是<div>一道<a href="http://www.baidu.com">php字符串</a>截取题</div>。';
</code>

Truncate the first 7 characters of the above string and display it. The final result should be this:

<code>'这是<div>一道<a href="http://www.baidu.com">php</a></div>'
</code>

Requirements:

  1. If there is an HTML tag in the string, skip it and don’t count it

  2. If any HTML tags are truncated after interception, then the truncated tags must be added with a closing tag at the end

Reply content:

<code>$str = '这是<div>一道<a href="http://www.baidu.com">php字符串</a>截取题</div>。';
</code>

Truncate the first 7 characters of the above string and display it. The final result should be this:

<code>'这是<div>一道<a href="http://www.baidu.com">php</a></div>'
</code>

Requirements:

  1. If there is an HTML tag in the string, skip it and don’t count it

  2. If any HTML tags are truncated after interception, then the truncated tags must be added with a closing tag at the end

I didn’t speculate on the purpose of the question, I simply wrote a regular replacement as required

<code class="php">function pure_cut($str, $len) {
    $reg = '/' . str_repeat('[^<>]((?:<[^>]+>)+)?', $len) . '$/u';
    $str = preg_replace_callback($reg, function($matches) {
        array_shift($matches);
        $replace = join('', $matches);
        return $replace;
    }, $str, 7);
    return $str;
}

echo pure_cut($str, 7);</code>

But I don’t quite understand requirement 2. When requirement 1 is met, the html tag will not be damaged and does not need to be repaired.

It should be to capture the content of the rich text editing box.

<code class="php"><?php 

$str = '这是<div>一道<a href="http://www.baidu.com">php字符串</a>截取题</div>。';

function truncate($text, $length = 100, $ending = '...', $exact = false, $considerHtml = true) {
    if ($considerHtml) {
            if (mb_strlen(strip_tags($text)) <= $length) {
                    return $text;
            }

            preg_match_all('/(<.+?>)?([^<>]*)/s', $text, $lines, PREG_SET_ORDER);
            $total_length = mb_strlen($ending);
            $open_tags = array();
            $truncate = '';

            foreach ($lines as $line_matchings) {
                    if (!empty($line_matchings[1])) {
                            if (preg_match('/^<(\s*.+?\/\s*|\s*(img|br|input|hr|area|base|basefont|col|frame|isindex|link|meta|param)(\s.+?)?)>$/is', $line_matchings[1])) {
                            } else if (preg_match('/^<\s*\/([^\s]+?)\s*>$/s', $line_matchings[1], $tag_matchings)) {
                                    $pos = array_search($tag_matchings[1], $open_tags);
                                    if ($pos !== false) {
                                    unset($open_tags[$pos]);
                                    }
                            } else if (preg_match('/^<\s*([^\s>!]+).*?>$/s', $line_matchings[1], $tag_matchings)) {
                                    array_unshift($open_tags, strtolower($tag_matchings[1]));
                            }
                            $truncate .= $line_matchings[1];
                    }
                    $content_length = mb_strlen(preg_replace('/&[0-9a-z]{2,8};|&#[0-9]{1,7};|[0-9a-f]{1,6};/i', ' ', $line_matchings[2]));
                    if ($total_length+$content_length > $length) {
                            $left = $length - $total_length;
                            $entities_length = 0;
                            if (preg_match_all('/&[0-9a-z]{2,8};|&#[0-9]{1,7};|[0-9a-f]{1,6};/i', $line_matchings[2], $entities, PREG_OFFSET_CAPTURE)) {
                                    foreach ($entities[0] as $entity) {
                                            if ($entity[1]+1-$entities_length <= $left) {
                                                    $left--;
                                                    $entities_length += mb_strlen($entity[0]);
                                            } else {
                                                    break;
                                            }
                                    }
                            }
                            $truncate .= mb_substr($line_matchings[2], 0, $left+$entities_length);
                            break;
                    } else {
                            $truncate .= $line_matchings[2];
                            $total_length += $content_length;
                    }
                    if($total_length >= $length) {
                            break;
                    }
            }
    } else {
            if (mb_strlen($text) <= $length) {
                    return $text;
            } else {
                    $truncate = mb_substr($text, 0, $length - mb_strlen($ending));
            }
    }
    if (!$exact) { 
            $spacepos = mb_strrpos($truncate, ' ');
            if (isset($spacepos)) {
                    $truncate = mb_substr($truncate, 0, $spacepos);
            }
    }
    $truncate .= $ending;
    if($considerHtml) {
            foreach ($open_tags as $tag) {
                    $truncate .= '</' . $tag . '>';
            }
    }
    return $truncate;
}

echo truncate($str, 7, '', true, true);</code>
Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn