
Detailed introduction to URLDecoder and URLEncoder in Java

黄舟 (Original)
2017-07-24 15:35:54

URLEncoder is a utility class for HTML form encoding. It contains static methods for converting a String to the application/x-www-form-urlencoded MIME format, and URLDecoder performs the reverse conversion. This article introduces both classes in Java with example code.

1 URLEncoder

Utility class for HTML form encoding. This class contains static methods for converting a String to the application/x-www-form-urlencoded MIME format. For more information about HTML form encoding, see the HTML specification.

When encoding a String, use the following rules:

Alphanumeric characters "a" to "z", "A" to "Z" and "0" to "9" remain unchanged.

The special characters ".", "-", "*" and "_" remain unchanged.

The space character " " is converted into a plus sign "+".

All other characters are unsafe, so they are first converted to one or more bytes using some encoding mechanism. Each byte is then represented by a 3-character string "%xy", where xy is the two-digit hexadecimal representation of the byte. The recommended encoding scheme is UTF-8. However, for compatibility reasons, if an encoding is not specified, the default encoding for the corresponding platform is used.
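The byte-to-"%xy" step in the last rule can be made concrete with a small sketch. It is not how URLEncoder is implemented; the class name PercentEscapeSketch and the helper escapeAllBytes are hypothetical, and the helper escapes every byte, including those of safe characters, purely to show where the hexadecimal digits come from.

```java
import java.nio.charset.StandardCharsets;

final class PercentEscapeSketch {

    // Hypothetical helper, not part of the JDK: converts the text to UTF-8
    // bytes and writes each byte as '%' plus two uppercase hex digits.
    static String escapeAllBytes(String text) {
        StringBuilder out = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            // Mask with 0xFF so bytes above 0x7F are treated as unsigned values
            out.append(String.format("%%%02X", b & 0xFF));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // \u00FC is the character ü, which takes two bytes in UTF-8
        System.out.println(escapeAllBytes("\u00FC")); // %C3%BC
        // @ is a single byte in UTF-8
        System.out.println(escapeAllBytes("@"));      // %40
    }
}
```

The number of "%xy" groups therefore depends on how many bytes the chosen encoding needs for a character, which is why the same text can produce different output under different encodings.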

For example, using the UTF-8 encoding scheme, the string "The string ü@foo-bar" is converted to "The+string+%C3%BC%40foo-bar", because in UTF-8 the character ü is encoded as two bytes, C3 (hex) and BC (hex), and the character @ is encoded as one byte, 40 (hex).
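The encoding rules can be checked directly against URLEncoder.encode(String, String). The following is a minimal sketch; the class name EncodeRulesSketch and the wrapper method encode are made up for illustration, and the wrapper only exists to hide the checked UnsupportedEncodingException.

```java
import java.io.UncheckedIOException;
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class EncodeRulesSketch {

    // Hypothetical wrapper: rethrows the checked exception as an unchecked one
    static String encode(String text, String charsetName) {
        try {
            return URLEncoder.encode(text, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Letters, digits and . - * _ pass through unchanged
        System.out.println(encode("abcXYZ019.-*_", "UTF-8"));
        // Space becomes '+', \u00FC (ü) becomes two escaped bytes, '@' becomes one
        System.out.println(encode("The string \u00FC@foo-bar", "UTF-8"));
        // In ISO-8859-1 the same character is the single byte FC
        System.out.println(encode("\u00FC", "ISO-8859-1"));
    }
}
```

The last call shows why the encoding should always be passed explicitly: the escaped form of a non-ASCII character differs between encodings, so relying on the platform default can give different results on different machines.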

2 URLDecoder

This class contains static methods for decoding String from application/x-www-form-urlencoded MIME format.

This conversion process is exactly the opposite of the process used by the URLEncoder class. It is assumed that every character in the encoded string is one of the following: "a" through "z", "A" through "Z", "0" through "9", and "-", "_", "." and "*". The "%" character is allowed, but is interpreted as the beginning of a special escape sequence.

The following rules are used in conversion:

Alphanumeric characters "a" to "z", "A" to "Z" and "0" to "9" remain unchanged.

The special characters ".", "-", "*" and "_" remain unchanged.

The plus sign "+" is converted into the space character " ".

A sequence of the form "%xy" is treated as one byte, where xy is the two-digit hexadecimal representation of the 8 bits. Then, every substring that contains one or more of these byte sequences consecutively is replaced by the character(s) whose encoding would produce those consecutive bytes. The encoding scheme used to decode these characters can be specified; if it is not, the platform's default encoding is used.
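The decoding rules can be tried out with URLDecoder.decode(String, String). This is a sketch only; DecodeRulesSketch and its decode wrapper are hypothetical names, and the wrapper just converts the checked exception into an unchecked one.

```java
import java.io.UncheckedIOException;
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

final class DecodeRulesSketch {

    // Hypothetical wrapper: rethrows the checked exception as an unchecked one
    static String decode(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // '+' becomes a space; %C3%BC is read as two bytes forming one UTF-8 character
        System.out.println(decode("The+string+%C3%BC%40foo-bar", "UTF-8"));
        // The same two bytes read as ISO-8859-1 yield two separate characters
        System.out.println(decode("%C3%BC", "ISO-8859-1"));
        // A literal plus sign survives only if it was escaped as %2B
        System.out.println(decode("1%2B1+%3D+2", "UTF-8"));
    }
}
```

The second call returns two characters (U+00C3 and U+00BC) instead of ü, which is the typical symptom of decoding with a different encoding from the one used for encoding.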

The decoder has two possible ways to handle illegal strings. It can either leave the illegal characters alone or throw an IllegalArgumentException. Which approach is taken is left to the implementation.
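Both ways of handling illegal input can be observed in practice. The sketch below assumes the OpenJDK implementation of URLDecoder; since the handling is implementation-dependent, other runtimes may behave differently. The class name IllegalInputSketch and the helper tryDecode are hypothetical.

```java
import java.io.UncheckedIOException;
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.Optional;

final class IllegalInputSketch {

    // Hypothetical helper: returns the decoded text, or an empty Optional
    // when the decoder rejects the input with IllegalArgumentException.
    static Optional<String> tryDecode(String encoded) {
        try {
            return Optional.of(URLDecoder.decode(encoded, "UTF-8"));
        } catch (IllegalArgumentException e) {
            return Optional.empty();
        } catch (UnsupportedEncodingException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Well-formed escape: decoded normally
        System.out.println(tryDecode("a%20b"));
        // A raw space is outside the expected character set; OpenJDK passes it through
        System.out.println(tryDecode("a b"));
        // "zz" is not hexadecimal: OpenJDK throws IllegalArgumentException
        System.out.println(tryDecode("%zz"));
        // Escape sequence cut short: OpenJDK throws IllegalArgumentException
        System.out.println(tryDecode("100%"));
    }
}
```

When the input comes from users or from the network, catching IllegalArgumentException around decode avoids an unexpected failure on a stray "%" character.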

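Simple example:

Java code

import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class UrlCodecDemo {
    public static void main(String[] args) {
        try {
            // Convert the text to UTF-8 bytes and escape each byte as %xy
            String encodeStr = URLEncoder.encode("中国", "UTF-8");
            System.out.println("Encoded: " + encodeStr);
            // Decode with the same encoding to recover the original text
            String decodeStr = URLDecoder.decode(encodeStr, "UTF-8");
            System.out.println("Decoded: " + decodeStr);
        } catch (UnsupportedEncodingException e) {
            e.printStackTrace();
        }
    }
}

Run result:

Encoded: %E4%B8%AD%E5%9B%BD
Decoded: 中国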
