


Today's topic is a display requirement: truncate a piece of text to a given physical (display) width. Note that the limit is not the number of bytes in the string. In UTF-8, a Chinese character takes 3 or 4 bytes, yet when displayed it occupies the width of two characters, while an English character occupies only one; full-width characters are different again.
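To make the difference between bytes, characters, and display width concrete, here is a minimal sketch. It assumes the mbstring extension is available; `measure()` is an illustrative helper name, not something from the original code.

```php
<?php
// Compare three ways of "measuring" a UTF-8 string.
// measure() is a hypothetical helper used only for this illustration.
function measure(string $s): array
{
    return [
        'bytes' => strlen($s),               // raw byte count
        'chars' => mb_strlen($s, 'UTF-8'),   // number of code points
        'width' => mb_strwidth($s, 'UTF-8'), // display columns (CJK and full-width count as 2)
    ];
}

foreach (['abc', '张三', 'Ａ'] as $sample) {
    $m = measure($sample);
    printf("%s: bytes=%d chars=%d width=%d\n", $sample, $m['bytes'], $m['chars'], $m['width']);
}
```

The third sample is the full-width Latin letter U+FF21, which is a single character but is reported as two columns wide, like a Chinese character.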
The input is a string of HTML, such as this:
<ol class="dp-xml"> <li class="alt"><span>&lt;div class="aaa"&gt;</span></li> <li class="alt"><span>&lt;a href="/aaa.php?id=1"&gt;</span></li> <li class="alt"><span>张三</span></li> <li class="alt"><span>&lt;/a&gt;</span></li> <li class="alt"><span>评论了</span></li> <li class="alt"><span>&lt;a href="/aaa.php?id=444"&gt;</span></li> <li class="alt"><span>李四</span></li> <li class="alt"><span>&lt;/a&gt;</span></li> <li class="alt"><span>分享的</span></li> <li class="alt"><span>&lt;a href="bbb.html"&gt;</span></li> <li class="alt"><span>一篇文章文章一长串的东西</span></li> <li class="alt"><span>&lt;/a&gt;</span></li> <li class="alt"><span>&lt;/div&gt;</span></li> </ol>
The truncation has to apply to the content inside the div tag, and the HTML tags must be kept: only the text is processed. For example, I may need to cut "李四" down to just "李", but if that is sent to the front end as is, the a tag opened before "李四" is never closed. So after truncating, I also have to make sure the HTML is still well-formed.
This problem is really not easy, and it had me stuck for two days. Note that the input is just a string: the content is HTML, but there is no DOM. On the front end it would be easier: get the DOM, process the nodes inside, and output the result through something like innerHTML. That is not available here, so I had to think differently. My colleague's idea was this:
Traverse the string character by character. Keep a flag, and set it to 1 when a tag is encountered. When processing the text inside a tag, first determine whether the current character may be Chinese. A UTF-8 encoded Chinese character is generally 3 bytes long in PHP, so on meeting one you have to skip two bytes without counting them... At this point my head started to spin. Personally, I find this approach unpleasant. Logic this delicate is hard to keep under control, and since a Chinese character can be 3 or 4 bytes in UTF-8, the rigor of the code is questionable.
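For comparison, the character-scanning idea can be sketched as follows. This is only an illustration of the approach, not the colleague's actual code: `visibleWidth()` is a hypothetical helper, and the sketch assumes that `<` and `>` never appear inside attribute values or text.

```php
<?php
// Walk the string one code point at a time and add up the display width
// of everything that is outside a tag. visibleWidth() is a hypothetical name.
function visibleWidth(string $html): int
{
    $inTag = false; // the "flag": true while we are between '<' and '>'
    $width = 0;

    // mb_str_split (PHP 7.4+) splits on characters, so there is no need
    // to skip the extra bytes of a multi-byte character by hand.
    foreach (mb_str_split($html, 1, 'UTF-8') as $char) {
        if ($char === '<') {
            $inTag = true;
            continue;
        }
        if ($char === '>') {
            $inTag = false;
            continue;
        }
        if (!$inTag) {
            $width += mb_strwidth($char, 'UTF-8');
        }
    }

    return $width;
}

echo visibleWidth('<a href="#">李四</a> ok'), "\n"; // 4 columns for 李四 plus 3 for " ok"
```

Counting is the easy half. To truncate safely with this approach, the code would also have to remember which tags are still open and close them afterwards, which is where the logic becomes fragile.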
My own idea was to use Tidy (see the PHP manual for usage). I studied Tidy yesterday and found it quite useful. First, parse the string into a Tidy object, like this:
<ol class="dp-xml"> <li class="alt"><span>$tidy = tidy_parse_string($str, array(), 'utf8');</span></li> <li><span>// The last argument sets the encoding. Note that it is utf8, not utf-8: there is no hyphen in the middle.</span></li> </ol>
Then get the body from $tidy (after parsing, Tidy automatically adds wrapper tags such as html and body): $body = tidy_get_body($tidy);
At this point you can use var_dump to look at the structure of $body. You will find that every tag has been turned into a corresponding object with its own properties. For example, for &lt;a href="#"&gt;sdf&lt;/a&gt;, some of the properties are:
name => "a"
value => "&lt;a href="#"&gt;sdf&lt;/a&gt;" (the HTML of the node, including its tags)
child => array([0] => a text node object whose value is "sdf")
attribute => array("href" => "#")
... other properties
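The structure described above can be explored with a short recursive dump. This is a sketch that assumes the tidy extension is loaded; `describeNode()` and the sample `$html` are illustrative names, and the exact whitespace Tidy keeps in text nodes may vary, so values are trimmed before printing.

```php
<?php
// Print a compact outline of a Tidy node tree.
// describeNode() is a hypothetical helper used only for this illustration.
function describeNode(tidyNode $node, int $depth = 0): string
{
    $indent = str_repeat('  ', $depth);

    // Text nodes have no name; their value is the text itself.
    if ($node->isText()) {
        return $indent . 'text: ' . trim($node->value) . "\n";
    }

    // attribute is null when the element has no attributes.
    $attrs = [];
    foreach ($node->attribute ?? [] as $name => $value) {
        $attrs[] = $name . '=' . $value;
    }

    $out = $indent . 'element: ' . $node->name
         . ($attrs ? ' [' . implode(', ', $attrs) . ']' : '') . "\n";

    // child is null when the element has no children.
    foreach ($node->child ?? [] as $child) {
        $out .= describeNode($child, $depth + 1);
    }

    return $out;
}

if (extension_loaded('tidy')) {
    $html = '<div class="aaa"><a href="#">sdf</a></div>';
    $tidy = tidy_parse_string($html, [], 'utf8');
    echo describeNode(tidy_get_body($tidy));
}
```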
As you can see, the value of the text node under the a tag's node can be processed separately, so truncating will not break the integrity of the HTML. Originally I assumed that after changing the value of the text node inside the a tag, the value of the a tag itself would change with it, in which case I could simply return the value of the a node. It turned out not to work that way. So after processing the text, you still have to assemble the new HTML yourself.
Once you know the structure of the Tidy object, the rest is straightforward: traverse all the nodes. For this requirement, that means finding the div tag and then processing the nodes inside it. The core of the code is as follows:
<ol class="dp-xml"> <li class="alt"><span>// $len is the display width that is still available</span></li> <li><span>if (mb_strwidth($subchild-&gt;value, 'utf-8') &gt;= $len)</span></li> <li class="alt"><span>{</span></li> <li><span> // this text node uses up the remaining width: trim it and stop</span></li> <li class="alt"><span> $subchild-&gt;value = mb_strimwidth($subchild-&gt;value, 0, $len, '…', 'utf-8');</span></li> <li><span> $trimed_str .= $subchild-&gt;value;</span></li> <li class="alt"><span> break;</span></li> <li><span>}</span></li> <li class="alt"><span>else</span></li> <li><span>{</span></li> <li class="alt"><span> // the whole text node fits: keep it and reduce the remaining width</span></li> <li><span> $trimed_str .= $subchild-&gt;value;</span></li> <li class="alt"><span> $len = $len - mb_strwidth($subchild-&gt;value, 'utf-8');</span></li> <li><span>}</span></li> </ol>
$subchild here is a child node. Note that mb_strwidth is used to get the string length. I strongly recommend mb_strwidth: it counts a Chinese character as two columns wide, which is exactly what is needed here. mb_strimwidth, which does the actual truncation, also counts a Chinese character as two columns. The functions starting with mb_ are really handy.
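A small sketch of how mb_strimwidth behaves on its own may help. `fitToWidth()` is a hypothetical wrapper, and the ASCII marker '...' is used here as an assumption so that its width is unambiguous.

```php
<?php
// Trim a string to at most $width display columns, appending $marker
// only when something was actually cut off.
// fitToWidth() is a hypothetical wrapper around mb_strimwidth().
function fitToWidth(string $text, int $width, string $marker = '...'): string
{
    return mb_strimwidth($text, 0, $width, $marker, 'UTF-8');
}

$title = '一篇文章文章一长串的东西';

echo mb_strwidth($title, 'UTF-8'), "\n"; // 24: twelve characters, two columns each
echo fitToWidth($title, 10), "\n";       // cut to fit, marker included in the 10 columns
echo fitToWidth('abc', 10), "\n";        // short enough, returned unchanged
```

The width limit includes the marker, so the visible text gets fewer columns than the limit whenever trimming happens.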
I won't post the full truncation code, because it was written for one specific requirement and is not general-purpose. When I have time I will generalize it and publish it.
It is also a pity that Firefox does not support the text-overflow property (at the time of writing); otherwise there would be no need to work so hard on truncation on the back end. If you have a better method, please suggest it! Any help is greatly appreciated.
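Since the full code is not shown, here is one way the pieces could fit together. It is a sketch under stated assumptions, not the author's implementation: it requires the tidy and mbstring extensions, `truncateHtml()` and `renderNode()` are hypothetical names, the marker is ASCII '...', and entities in text nodes are counted by their source length rather than their rendered width.

```php
<?php
// Rebuild the HTML from the Tidy node tree, spending a width budget on the
// text nodes and closing every element that was opened.
const VOID_ELEMENTS = ['br', 'hr', 'img', 'input'];

function renderNode(tidyNode $node, int &$remaining, bool &$done): string
{
    if ($done) {
        return ''; // budget already used up: drop everything that follows
    }

    if ($node->isText()) {
        $text = $node->value;
        $w = mb_strwidth($text, 'UTF-8');
        if ($w >= $remaining) {
            $done = true;
            // Only add the marker when there is room for it.
            $marker = $remaining > 3 ? '...' : '';
            return mb_strimwidth($text, 0, $remaining, $marker, 'UTF-8');
        }
        $remaining -= $w;
        return $text;
    }

    if ($node->name === '') {
        return ''; // comments and other unnamed nodes are skipped in this sketch
    }

    $attrs = '';
    foreach ($node->attribute ?? [] as $name => $value) {
        $attrs .= ' ' . $name . '="' . htmlspecialchars($value, ENT_QUOTES, 'UTF-8', false) . '"';
    }
    if (in_array($node->name, VOID_ELEMENTS, true)) {
        return '<' . $node->name . $attrs . '>';
    }

    $inner = '';
    foreach ($node->child ?? [] as $child) {
        $inner .= renderNode($child, $remaining, $done);
    }

    // The closing tag is always written, so the output stays balanced.
    return '<' . $node->name . $attrs . '>' . $inner . '</' . $node->name . '>';
}

function truncateHtml(string $html, int $width): string
{
    $tidy = tidy_parse_string($html, [], 'utf8');
    $body = $tidy ? tidy_get_body($tidy) : null;
    $out = '';
    $done = false;
    foreach ($body->child ?? [] as $child) {
        $out .= renderNode($child, $width, $done);
    }
    return $out;
}

if (extension_loaded('tidy')) {
    echo truncateHtml('<div class="aaa"><a href="/aaa.php?id=1">张三</a>评论了<a href="/aaa.php?id=444">李四</a>分享的</div>', 12), "\n";
}
```

Because each element is rebuilt around whatever text survives, a cut in the middle of "李四" still leaves the surrounding a and div tags closed. Tidy may normalize whitespace in text nodes, so the exact output can differ slightly from the input formatting.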



