Detailed analysis of php trim function

不言
2019-03-01 13:26:09

This article takes a detailed look at PHP's trim function. It may serve as a useful reference for anyone who needs it.

String processing is among the most common tasks in any program. PHP's trim function removes characters from the ends of a string, and its most common use is stripping whitespace. But is this simple function really as simple as you think?

trim takes the string to process and an optional list of characters to remove.

trim removes from both ends, ltrim from the left, and rtrim from the right. In the PHP source code, all three are ultimately handled by a single function, so what follows applies to the unified trim inside PHP.
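As a quick illustration of the three variants, the following sketch calls each of them without a second argument. The input string and variable name are arbitrary examples.

```php
<?php
// Illustrative input: two spaces on each side of the word.
$s = "  hello  ";

var_dump(trim($s));   // string(5) "hello"    both ends
var_dump(ltrim($s));  // string(7) "hello  "  left only
var_dump(rtrim($s));  // string(7) "  hello"  right only
```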

The source code is in the php_trim function in ext/standard/string.c.

trim function processing logic:

1. Check whether what, the set of characters to remove, has been passed. If not, the default characters are removed.

2. Check the length of what, and handle removal of a single character and of multiple characters separately.

3. AND mode bitwise with 1 and with 2 to decide whether to trim the left side and the right side, respectively.
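Step 3 can be sketched in PHP as follows. The constant and function names are hypothetical and exist only for this illustration. The mapping of trim to 3, ltrim to 1 and rtrim to 2 is how the mode values are passed in the C source as far as this sketch assumes.

```php
<?php
// Hypothetical names mirroring the two mode bits described above.
const TRIM_LEFT  = 1;
const TRIM_RIGHT = 2;

// Report which sides a given mode value would trim.
function describe_mode(int $mode): array
{
    return [
        'left'  => ($mode & TRIM_LEFT) !== 0,
        'right' => ($mode & TRIM_RIGHT) !== 0,
    ];
}

var_dump(describe_mode(3)); // both bits set: left and right (trim)
var_dump(describe_mode(1)); // left only (ltrim)
var_dump(describe_mode(2)); // right only (rtrim)
```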

If a single character is to be removed:

For removal on the left, traverse the string from the start, take the position of the first character that is not equal to what as the start of the new string, and reduce the length accordingly.

For removal on the right, traverse from the end, find the first character that is not equal to what, and subtract the number of characters skipped from the length of the string.

At this point both the start position and the length of the new string are known. The function then copies that part into a new string and returns it as the trimmed result.
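The single-character case can be approximated in PHP like this. It is a minimal sketch of the logic described above, not the C implementation, and trim_single() is a hypothetical name.

```php
<?php
// Sketch only: trim_single() is a hypothetical helper, not a PHP built-in.
function trim_single(string $str, string $ch, int $mode = 3): string
{
    $start = 0;
    $len = strlen($str);

    if ($mode & 1) {
        // Left: advance the start while the current byte equals $ch.
        while ($len > 0 && $str[$start] === $ch) {
            $start++;
            $len--;
        }
    }
    if ($mode & 2) {
        // Right: shrink the length while the last byte equals $ch.
        while ($len > 0 && $str[$start + $len - 1] === $ch) {
            $len--;
        }
    }
    // Copy out the remaining part as the new string.
    return substr($str, $start, $len);
}

var_dump(trim_single('xxabxx', 'x'));    // string(2) "ab"
var_dump(trim_single('xxabxx', 'x', 1)); // string(4) "abxx"
```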

Removal of multiple characters:

First a mask is used to mark the characters that need to be removed (the mask can be thought of as a lookup table indexed by each character's ASCII value). After that the operation is the same as for a single character, except that the loop ends at the first character that is not in the table.
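A rough PHP rendering of the mask idea is shown below. The function names are hypothetical, and the sketch deliberately ignores range syntax: every byte of the list is marked literally.

```php
<?php
// Sketch only: build_mask() and trim_with_mask() are hypothetical names.
function build_mask(string $what): array
{
    // One slot per possible byte value; true means "remove this byte".
    $mask = array_fill(0, 256, false);
    for ($i = 0, $n = strlen($what); $i < $n; $i++) {
        $mask[ord($what[$i])] = true;
    }
    return $mask;
}

function trim_with_mask(string $str, string $what): string
{
    $mask = build_mask($what);
    $start = 0;
    $end = strlen($str);

    // Stop at the first byte, from each side, that is not in the table.
    while ($start < $end && $mask[ord($str[$start])]) {
        $start++;
    }
    while ($end > $start && $mask[ord($str[$end - 1])]) {
        $end--;
    }
    return substr($str, $start, $end - $start);
}

var_dump(trim_with_mask('abcdf', 'fd')); // string(3) "abc"
```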

Default:

The processing is the same as before, except that the candidates are limited to characters whose ASCII code is no greater than 32 (the code of the space character), and among those only ' ', '\r', '\t', '\v', '\0' and '\n' are removed.
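The default set can be checked directly. In this small sketch the variable names are arbitrary, and "\x01" is just one example of a control character that is below 32 but not in the default set.

```php
<?php
// The six bytes stripped by default: space, tab, newline,
// carriage return, NUL and vertical tab.
$defaults = " \t\n\r\0\x0B";

$padded = $defaults . "text" . $defaults;
var_dump(trim($padded) === "text"); // bool(true)

// Other control characters below 32 are left alone by default.
var_dump(trim("\x01text\x01") === "\x01text\x01"); // bool(true)
```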

From this we learn the following points:

1. By default, trim removes ' ', '\r', '\t', '\v', '\0' and '\n'.

2. With a single character, trim loops and stops at the first character that is not equal to it.

3. With multiple characters, trim loops and stops at the first character that is not in the list.

Let's take a look at the php_charmask function, which builds the mask.

The middle part of the function can be skipped: it only returns an error for illegal data.

Just look at the first if. Suppose the string passed in is what='a..f' (the range syntax uses two dots). The input pointer points to a. The if condition is satisfied at this point, and the body adds a, b, c, d, e and f to the mask. So trim('abcdefg', 'a..f') removes a whole range of characters, and the returned content is only g.
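The range branch can be sketched in PHP as follows. This is a hypothetical rendering for illustration, not the C source: charmask_sketch() is an invented name, and malformed ranges are simply treated as literal characters here, whereas the real function reports them as errors.

```php
<?php
// Sketch only: charmask_sketch() is a hypothetical PHP rendering.
function charmask_sketch(string $what): array
{
    $mask = array_fill(0, 256, false);
    $n = strlen($what);

    for ($i = 0; $i < $n; $i++) {
        $c = ord($what[$i]);
        // Range form "x..y": two dots, then an end byte not below the start byte.
        if ($i + 3 < $n && $what[$i + 1] === '.' && $what[$i + 2] === '.'
            && ord($what[$i + 3]) >= $c) {
            for ($b = $c, $last = ord($what[$i + 3]); $b <= $last; $b++) {
                $mask[$b] = true;
            }
            $i += 3; // skip past the rest of the range
        } else {
            // Simplification: no error handling for malformed ranges.
            $mask[$c] = true;
        }
    }
    return $mask;
}

// Collect the marked bytes back into a string for display.
$marked = implode('', array_map('chr', array_keys(array_filter(charmask_sketch('a..f')))));
var_dump($marked);                 // string(6) "abcdef"
var_dump(trim('abcdefg', 'a..f')); // string(1) "g"
```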

The following actual outputs make this easier to understand:

1. trim('abcdf', 'fd'); outputs abc. Order does not matter to trim: any character that is in the list is removed.

2. trim('abccdffff', 'f'); outputs abccd. trim removes every character that meets the condition, not just one.

3. trim('abcdffff', 'a..d'); outputs ffff. trim accepts a range, but if you really want to remove the literal string 'a..d', you cannot use trim.
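The last point is easy to trip over, so here is a small check. The input 'a..dxyz' is an invented example: because 'a..d' is read as the range a to d, the dots themselves are not in the mask and trimming stops at the first dot.

```php
<?php
// The three cases from the list above.
var_dump(trim('abcdf', 'fd'));      // string(3) "abc"
var_dump(trim('abccdffff', 'f'));   // string(5) "abccd"
var_dump(trim('abcdffff', 'a..d')); // string(4) "ffff"

// The mask holds a, b, c and d only, so the literal prefix "a..d" is not removed.
var_dump(trim('a..dxyz', 'a..d'));  // string(6) "..dxyz"
```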

Because trim removes bytes by list, it runs into problems with multi-byte characters, which is why trim can produce garbled output for Chinese text.

Take trim('品、', '、'). The UTF-8 encoding of '品' is 'e5 93 81' in hexadecimal, and that of '、' is 'e3 80 81'. trim works in bytes, and a Chinese character takes 3 bytes in UTF-8, so this call is equivalent to asking trim to remove three characters whose hexadecimal values are e3, 80 and 81. The hexadecimal representation of the returned string is therefore 'e5 93', because the trailing 81 of '品' has been removed as well.

trim('的、', '、') returns the correct result, because the hexadecimal representation of '的' is 'e7 9a 84', and none of its bytes is in the list.
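Both cases can be reproduced with bin2hex. In this sketch the characters are written as byte escapes so the result does not depend on the encoding of the source file. The variable names are arbitrary.

```php
<?php
// UTF-8 bytes of the three characters used in the example.
$pin  = "\xe5\x93\x81"; // 品
$de   = "\xe7\x9a\x84"; // 的
$mark = "\xe3\x80\x81"; // 、

// 品 ends in 0x81, which is also a byte of 、, so it gets cut.
$broken = trim($pin . $mark, $mark);
echo bin2hex($broken), "\n"; // e593

// 的 shares no byte with 、, so it survives intact.
$intact = trim($de . $mark, $mark);
echo bin2hex($intact), "\n"; // e79a84
var_dump($intact === $de);   // bool(true)
```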

So trim is not that simple. Always remember: trim removes every character that is in the list, and stops at the first character that is not!

Statement:
This article is reproduced from csdn.net. If there is any infringement, please contact admin@php.cn to have it removed.