XML Guide - XML Encoding

黄舟
2017-02-11 15:14:05


XML documents can contain foreign characters such as Norwegian, French, or Chinese. (Translator's note: this part could not be translated directly from the original text, so some of the content below was written by the translator.)
For the parser to understand these characters, the XML document must be saved in a known character encoding, and that encoding must match what the document declares.



Windows 95/98 Notepad
Windows 95/98 Notepad cannot save files in Unicode encoding format.
You can still use it to edit and save XML documents that contain foreign characters (for example, Norwegian, French, or Chinese):

<?xml version="1.0"?> 
<note> 
<from>小奀</from> 
<to>小林</to> 
<message>晚上一起去火锅呀</message> 
</note>


However, if you open this Notepad-edited XML document in a browser such as IE 5.0, it produces an error.
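A likely explanation is that an XML parser assumes UTF-8 when a document has neither an encoding declaration nor a byte-order mark, while the editor stored the text in a local code page. The sketch below illustrates this with Python's standard-library parser rather than IE 5.0; the choice of GBK as the editor's code page and the helper name parses_ok are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

NOTE = "<note><from>小奀</from><to>小林</to><message>晚上一起去火锅呀</message></note>"


def parses_ok(data: bytes) -> bool:
    """Return True if the parser accepts the bytes as well-formed XML."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


# No declaration and no byte-order mark: the parser falls back to UTF-8.
utf8_bytes = NOTE.encode("utf-8")
# Assumption: an ANSI editor on a Chinese system writes GBK bytes.
gbk_bytes = NOTE.encode("gbk")

print("UTF-8 bytes accepted:", parses_ok(utf8_bytes))
print("GBK bytes accepted:", parses_ok(gbk_bytes))
```

The same text is accepted or rejected depending only on how it was stored, which is why the declaration matters.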



Using encoding in Windows 95/98 Notepad
When you edit XML files in Windows 95/98 Notepad, the encoding attribute must be set.
To avoid errors, add an encoding attribute to the XML declaration that names the encoding the document is saved in, but do not name a Unicode encoding.
The following encoding does not cause an error, and Chinese characters display normally:

<?xml version="1.0" encoding="gb2312"?>




The following encoding also does not cause an error, and Chinese characters display normally:

<?xml version="1.0" encoding="gbk"?>




The following encoding does not cause an error, but Chinese characters are displayed as garbled text:

<?xml version="1.0" encoding="windows-1252"?>




The following encoding does not cause an error either, but Chinese characters are again displayed as garbled text:

<?xml version="1.0" encoding="ISO-8859-1"?>
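Why a Western single-byte declaration produces garbled text instead of an error can be sketched with plain byte decoding: ISO-8859-1 assigns a character to every byte value, so decoding never fails, it just yields the wrong characters. The example assumes the file was stored as GBK bytes; the variable names are illustrative only.

```python
original = "晚上一起去火锅呀"

# Assumption: the editor stored the text as GBK bytes (two bytes per character here).
stored = original.encode("gbk")

right = stored.decode("gbk")          # declaration matches the stored bytes
wrong = stored.decode("iso-8859-1")   # declaration names a Western encoding

print("matching declaration recovers text:", right == original)
print("characters before and after:", len(original), len(wrong))

# The bytes are untouched, so re-encoding the garbled text restores them.
recovered = wrong.encode("iso-8859-1").decode("gbk")
print("recoverable:", recovered == original)
```

Each two-byte Chinese character is read as two unrelated Western characters, which matches the garbled display described above.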




The following encoding does not cause an error, and Chinese characters display normally:

<?xml version="1.0" encoding="UTF-8"?>




The following encoding causes an error:

<?xml version="1.0" encoding="UTF-16"?>


Using Windows 2000 Notepad
Windows 2000 Notepad can save files in Unicode encoding format.
Windows 2000 Notepad supports the Unicode character set. For example, you can use it to save the following XML document in Unicode encoding format (note that the XML declaration contains no encoding information):

<?xml version="1.0"?> 
<note><from>小奀</from><to>小林</to><message>晚上一起去火锅呀</message></note>


This file, note_encode_none_u.xml, does not cause errors in IE 5.0+, but errors may occur in Netscape 6.2. Compare the two files note_encode_none.xml and note_encode_none_u.xml: viewed as source, they look identical, so why can one be displayed and the other not? The answer is the encoding. The text is the same, but only one of the files was saved in Unicode format.
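What a Unicode save looks like at the byte level can be sketched as follows. The example assumes that Notepad's "Unicode" option writes UTF-16 little-endian with a leading byte-order mark (BOM), and it uses Python's standard-library parser rather than a browser, so it only illustrates the mechanism.

```python
import codecs
import xml.etree.ElementTree as ET

doc = (
    '<?xml version="1.0"?>\n'
    "<note><from>小奀</from><to>小林</to><message>晚上一起去火锅呀</message></note>"
)

# Assumption: a "Unicode" save is UTF-16 little-endian preceded by a BOM.
unicode_bytes = codecs.BOM_UTF16_LE + doc.encode("utf-16-le")

# The first two bytes are the BOM; a parser can use them to detect UTF-16
# even though the declaration carries no encoding attribute.
print(unicode_bytes[:2])

root = ET.fromstring(unicode_bytes)
print(root.find("message").text)
```

A parser that does not inspect the BOM sees a zero byte after every ASCII character and rejects the file, which may be what happened in the browser that reported errors.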

Windows 2000 Notepad Encoding
Windows 2000 Notepad can also save files in "UTF-16" encoding format.
If you save the file in Unicode encoding format but the encoding attribute in the XML declaration names a different encoding, an error may occur.
The following code will cause an error:

<?xml version="1.0" encoding="windows-1252"?>




The following code will cause an error:

<?xml version="1.0" encoding="ISO-8859-1"?>




The following code will cause an error:

<?xml version="1.0" encoding="UTF-8"?>




The following file, note_encode_utf16_u.xml, displays normally in IE 5.0+, but causes an error in the Netscape 6.2 browser.

<?xml version="1.0" encoding="UTF-16"?>
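The pattern in this section (a Unicode save fails unless the declaration also says UTF-16) can be reproduced with Python's standard-library parser. This is a sketch of analogous behavior, not of IE 5.0 itself; the helper names saved_as_unicode and parses_ok are hypothetical.

```python
import xml.etree.ElementTree as ET

BODY = "<note><from>小奀</from><to>小林</to></note>"


def saved_as_unicode(declared: str) -> bytes:
    """Build a document that declares one encoding but is stored as UTF-16."""
    doc = f'<?xml version="1.0" encoding="{declared}"?>\n{BODY}'
    return doc.encode("utf-16")  # Python's "utf-16" codec prepends a BOM


def parses_ok(data: bytes) -> bool:
    """Return True if the parser accepts the bytes as well-formed XML."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


for declared in ("ISO-8859-1", "UTF-8", "UTF-16"):
    print(declared, "accepted:", parses_ok(saved_as_unicode(declared)))
```

Only the declaration that agrees with the stored bytes is expected to be accepted; the other two contradict the byte-order mark at the start of the file.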



Error message
When browsing XML documents using IE5.0 or higher, you may encounter two different encoding errors:
An invalid character was found in text content.
This error may occur when the characters in your XML document do not match the encoding the parser assumes for it. Usually the XML document contains some "non-English" characters, was saved with a single-byte encoding editor, and does not set the encoding format in its XML declaration.

Switch from current encoding to specified encoding is not supported.
This error occurs in two cases: the XML document is saved in Unicode/UTF-16 encoding format, but its declaration names a byte-oriented encoding (such as Windows-1252, ISO-8859-1, or UTF-8; note that UTF-8 is variable-width rather than strictly single-byte); or the XML document is saved in a byte-oriented encoding format, but its declaration names a Unicode/UTF-16 encoding.
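One rough way to catch this mismatch before opening a file in a browser is to compare the byte-order mark with the declared encoding. The sketch below is a simplified check under stated assumptions: the function names are hypothetical, UTF-32 is ignored, and a file without a BOM is treated as having nothing to contradict.

```python
import codecs
import re


def bom_family(data: bytes):
    """Return 'utf-16' or 'utf-8' if the bytes start with that BOM, else None."""
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8"
    return None


def declared_encoding(data: bytes):
    """Extract the encoding attribute from the XML declaration, if present."""
    # Dropping NUL bytes makes a UTF-16 declaration readable as ASCII
    # (a sketch-level shortcut, not a real decoder).
    head = data[:200].replace(b"\x00", b"")
    match = re.search(rb'<\?xml[^>]*encoding=["\']([A-Za-z0-9._-]+)["\']', head)
    return match.group(1).decode("ascii").lower() if match else None


def declaration_matches_bytes(data: bytes) -> bool:
    """Return False only when the BOM and the declaration clearly disagree."""
    family = bom_family(data)
    declared = declared_encoding(data)
    if family is None or declared is None:
        return True  # nothing to compare
    if family == "utf-16":
        return declared.startswith("utf-16")
    return declared == "utf-8"


bad = '<?xml version="1.0" encoding="UTF-8"?><note/>'.encode("utf-16")
good = '<?xml version="1.0" encoding="UTF-16"?><note/>'.encode("utf-16")
print(declaration_matches_bytes(bad), declaration_matches_bytes(good))
```

A check like this only detects the Unicode-versus-byte-oriented mismatch; it cannot tell two single-byte encodings apart.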

Conclusion
Before saving an XML document, set its encoding format in the XML declaration. Some suggestions to avoid errors:
Use an editor that supports Unicode encoding.
Make sure you know which encoding format you are using.
Use the encoding attribute in the XML declaration to state the encoding format of the document.
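When a document is generated by a program instead of a text editor, the same advice can be followed by letting the serializer write both the bytes and the declaration, so the two cannot disagree. A minimal sketch using Python's standard library, with UTF-8 chosen here as an assumption:

```python
import io
import xml.etree.ElementTree as ET

note = ET.Element("note")
ET.SubElement(note, "from").text = "小奀"
ET.SubElement(note, "to").text = "小林"
ET.SubElement(note, "message").text = "晚上一起去火锅呀"

# The serializer encodes the text and writes a matching declaration in one step.
buffer = io.BytesIO()
ET.ElementTree(note).write(buffer, encoding="utf-8", xml_declaration=True)
data = buffer.getvalue()

print(data.split(b"\n", 1)[0])              # the generated XML declaration
print(ET.fromstring(data).find("to").text)  # the text survives a round trip
```

In real use the BytesIO buffer would be replaced by a file opened in binary mode.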