Implementation method of php Chinese url transcoding

藏色散人 (Original)
2020-07-21 10:52:40

In PHP, you can use the urlencode() or rawurlencode() function to encode a URL that contains Chinese characters. The syntax is "urlencode (string str)" and "rawurlencode (string str)" respectively.

php Chinese url transcoding

To encode a URL in PHP, use urlencode() or rawurlencode(). The difference between the two is that the former encodes a space as '+', while the latter encodes it as '%20'. Note that you should encode only the individual parts of a URL (such as a parameter value), never the whole URL; otherwise the colons and slashes are escaped as well.
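As a minimal sketch of that difference, the snippet below encodes the same sample string with both functions and then builds a URL by encoding only the parameter value. The host example.com, the script name search.php and the variable names are placeholders chosen for illustration, and the source file is assumed to be saved as UTF-8.

```php
<?php
// Sample value containing a space and Chinese characters (illustrative only).
$keyword = 'php 中文';

// urlencode() turns the space into "+", rawurlencode() turns it into "%20".
$formStyle = urlencode($keyword);     // php+%E4%B8%AD%E6%96%87
$rawStyle  = rawurlencode($keyword);  // php%20%E4%B8%AD%E6%96%87

// Encode only the value, never the whole URL.
$goodUrl = 'http://example.com/search.php?q=' . $formStyle;

// Encoding the whole URL also escapes ":", "/" and "?", which breaks it.
$badUrl = urlencode('http://example.com/search.php?q=' . $keyword);

echo $formStyle, "\n", $rawStyle, "\n", $goodUrl, "\n", $badUrl, "\n";
```

The last line of output starts with http%3A%2F%2F, which is no longer usable as a link target.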

The following is a detailed explanation:

string urlencode ( string str)

Returns a string in which all non-alphanumeric characters except -_. have been replaced with a percent sign (%) followed by two hexadecimal digits, and spaces are encoded as plus signs (+). This is the same encoding used for data posted from a WWW form, that is, the application/x-www-form-urlencoded media type. For historical reasons, it differs from the RFC 1738 encoding (see rawurlencode()) in that spaces are encoded as plus signs (+). This function is convenient for encoding a string to be used in the query part of a URL, and for passing variables to the next page:

Example 1. urlencode() Example

<?php
// Sample input for illustration; in practice this comes from the user.
$userinput = 'php 中文';
echo '<a href="mycgi?foo=', urlencode($userinput), '">';
?>

Note: Be careful with variable names that match HTML entities. Strings such as &amp, &copy and &pound are parsed by the browser, and the actual entity is used in place of the intended variable name. This is an obvious source of confusion, and the W3C has been warning about it for years. Reference: http://www.w3.org/TR/html4/appendix/notes.html#h-B.2.2 PHP supports changing the argument separator to the semicolon recommended by the W3C through the arg_separator .ini directive. Unfortunately, most user agents do not send form data in semicolon-separated format. A more portable solution is to use &amp; instead of & as the separator. You do not need to change PHP's arg_separator for this. Leave it as & and simply encode your URLs with htmlentities(urlencode($data)).

Example 2. urlencode() and htmlentities() Example

<?php
echo '<a href="mycgi?foo=', htmlentities(urlencode($userinput)), '">';
?>
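To make the separator issue concrete, here is a small sketch with two parameters. The parameter names (name, copy) and their values are hypothetical; copy was picked on purpose because an unescaped &copy in an href may be read as the © entity. Each value is URL-encoded separately, and the assembled query string is HTML-escaped once at the end.

```php
<?php
// Hypothetical user-supplied values.
$name = 'Tom & Jerry';
$copy = '中文';

// Step 1: URL-encode each value, then join the pairs with a plain "&".
$query = 'name=' . urlencode($name) . '&copy=' . urlencode($copy);

// Step 2: HTML-escape the whole query so the "&" separator becomes "&amp;".
$link = '<a href="mycgi?' . htmlentities($query) . '">link</a>';

echo $link, "\n";
// <a href="mycgi?name=Tom+%26+Jerry&amp;copy=%E4%B8%AD%E6%96%87">link</a>
```

The "&" inside the value is already %26 after step 1, so only the separator is touched by htmlentities().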

string rawurlencode ( string str)

Returns a string in which all non-alphanumeric characters except -_. have been replaced with a percent sign (%) followed by two hexadecimal digits. Since PHP 5.3.0 the tilde (~) is also left unencoded, following RFC 3986. This encoding, described in RFC 1738, protects literal characters from being interpreted as special URL delimiters, and protects URLs from being mangled by transmission media that perform character conversions (such as some mail systems). For example, if you want to include a password in an FTP URL:

Example 1. rawurlencode() Example 1

<?php
echo '<a href="ftp://user:', rawurlencode('foo @+%/'),
   '@ftp.my.com/x.txt">';
?>

Or, if you want to pass information in the PATH_INFO component of the URL:

Example 2. rawurlencode() Example 2

<?php
echo '<a href="http://x.com/department_list_script/',
   rawurlencode('sales and marketing/Miami'), '">';
?>
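Note that rawurlencode() also escapes the slash, so in the example above "sales and marketing/Miami" arrives as a single path segment. If the slash is meant to stay a path separator, one possible approach is to encode each segment on its own. The helper name encodePathSegments below is hypothetical, not a built-in function.

```php
<?php
// Hypothetical helper: percent-encode every path segment but keep the "/" separators.
function encodePathSegments(string $path): string
{
    return implode('/', array_map('rawurlencode', explode('/', $path)));
}

$whole    = rawurlencode('sales and marketing/Miami');      // slash becomes %2F
$segments = encodePathSegments('sales and marketing/Miami'); // slash is kept

echo $whole, "\n";    // sales%20and%20marketing%2FMiami
echo $segments, "\n"; // sales%20and%20marketing/Miami
```

Which form is right depends on how the receiving script splits PATH_INFO.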

For decoding, use the corresponding urldecode() and rawurldecode() functions. rawurldecode() does not decode a plus sign ('+') into a space, while urldecode() does. Details follow:

string urldecode ( string str)

Decodes any %## encoding in the given string and converts plus signs ('+') to spaces. Returns the decoded string.

Example 1. urldecode() example

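<?php
// $QUERY_STRING (register_globals) and split() no longer exist in current PHP,
// so read the query string from $_SERVER and split with explode() instead.
$query = $_SERVER['QUERY_STRING'] ?? '';
foreach (explode('&', $query) as $pair) {
   if ($pair === '') {
      continue; // skip empty pairs such as a trailing "&"
   }
   // Limit to 2 parts so "=" inside a value is kept; pad a missing value with ''.
   $b = array_pad(explode('=', $pair, 2), 2, '');
   echo 'Value for parameter ', htmlspecialchars(urldecode($b[0])),
   ' is ', htmlspecialchars(urldecode($b[1])), "<br />\n";
}
?>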

string rawurldecode (string str)

Returns a string in which every sequence of a percent sign (%) followed by two hexadecimal digits has been replaced with the literal character.

Example 1. rawurldecode() Example

<?php
echo rawurldecode('foo%20bar%40baz'); // foo bar@baz
?>
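The difference between the two decoders only shows up when the input contains a literal plus sign. The following short sketch runs both on the same made-up input string:

```php
<?php
// Contains a literal "+", an encoded space (%20) and an encoded plus (%2B).
$encoded = 'a+b%20c%2Bd';

$formDecoded = urldecode($encoded);     // "+" is turned into a space
$rawDecoded  = rawurldecode($encoded);  // "+" is left untouched

echo $formDecoded, "\n"; // a b c+d
echo $rawDecoded, "\n";  // a+b c+d
```

As a rule of thumb, decode with the counterpart of whichever function produced the string: urldecode() for urlencode() output, rawurldecode() for rawurlencode() output.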

However, note that urldecode() and rawurldecode() simply return the bytes that were percent-encoded, and for URLs generated by browsers these are normally UTF-8. If the URL contains Chinese characters and the page is not served as UTF-8, the decoded string must be converted to the page encoding before it can be displayed correctly.
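A minimal sketch of such a conversion, assuming the mbstring extension is available and the page is served as GBK (both are assumptions for illustration; iconv() could be used in a similar way):

```php
<?php
// Percent-encoded UTF-8 bytes of "中文", as a browser would typically send them.
$encoded = '%E4%B8%AD%E6%96%87';

// Decoding only restores the original bytes, which are UTF-8 here.
$utf8 = rawurldecode($encoded);

// Convert to the page encoding (GBK is an assumed example) before output.
$gbk = mb_convert_encoding($utf8, 'GBK', 'UTF-8');

echo bin2hex($utf8), "\n"; // e4b8ade69687
echo bin2hex($gbk), "\n";  // d6d0cec4
```

The hex dumps show that the same two characters occupy six bytes in UTF-8 and four bytes in GBK, which is why printing the unconverted string on a GBK page produces garbled text.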

There is another problem: sometimes the URL you receive is not in the %nn format (n = 0..F) but in the %unnnn format (n = 0..F), which is what JavaScript's escape() produces for characters outside Latin-1. urldecode() and rawurldecode() cannot decode this format correctly, so a function such as the following is needed:

<?php
// Decodes both %nn escapes and the non-standard %unnnn escapes; returns UTF-8.
function utf8RawUrlDecode ($source)
{
    $decodedStr = "";
    $pos = 0;
    $len = strlen ($source);
    while ($pos < $len) {
        $charAt = substr ($source, $pos, 1);
        if ($charAt == '%') {
            $pos++;
            $charAt = substr ($source, $pos, 1);
            if ($charAt == 'u') {
                // %unnnn: a Unicode code point written as four hex digits
                $pos++;
                $unicodeHexVal = substr ($source, $pos, 4);
                $unicode = hexdec ($unicodeHexVal);
                // Build the numeric entity and decode it into a UTF-8 character.
                // (utf8_encode() would return the entity text unchanged.)
                $entity = "&#" . $unicode . ';';
                $decodedStr .= html_entity_decode ($entity, ENT_QUOTES, 'UTF-8');
                $pos += 4;
            }
            else {
                // %nn: one escaped byte
                $hexVal = substr ($source, $pos, 2);
                $decodedStr .= chr (hexdec ($hexVal));
                $pos += 2;
            }
        } else {
            // ordinary character, copy as-is
            $decodedStr .= $charAt;
            $pos++;
        }
    }
    return $decodedStr;
}
?>
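The loop above treats every %unnnn escape as a separate code point, so characters that escape() writes as a surrogate pair (for example emoji) are not recombined. As an alternative sketch, the function below does the same job with one regular expression and converts each run of %unnnn escapes as UTF-16BE, which lets surrogate pairs decode correctly. The name decodePercentU is hypothetical, and the mbstring extension is assumed to be available.

```php
<?php
// Hypothetical helper: decodes %nn bytes and %unnnn UTF-16 code units in one pass.
function decodePercentU(string $source): string
{
    return preg_replace_callback(
        '/((?:%u[0-9a-fA-F]{4})+)|%([0-9a-fA-F]{2})/',
        function (array $m): string {
            if ($m[1] !== '') {
                // A run of %unnnn escapes: read the hex digits as UTF-16BE bytes,
                // so that surrogate pairs are combined into one character.
                $utf16 = hex2bin(str_replace('%u', '', $m[1]));
                return mb_convert_encoding($utf16, 'UTF-8', 'UTF-16BE');
            }
            // A plain %nn escape: one raw byte.
            return chr(hexdec($m[2]));
        },
        $source
    );
}

echo decodePercentU('%u4E2D%u6587%20PHP'), "\n"; // 中文 PHP
```

Malformed escapes such as %uZZZZ do not match the pattern and are left in the output unchanged.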
