XML technical manual
author:php.cn  update time:2022-04-14 15:57:53

XML encoding



XML documents can contain non-ASCII characters, such as æ ø å in Norwegian, or ê è é in French.

To avoid errors, you need to specify the XML encoding or save the XML file as Unicode.


XML encoding errors

If you load an XML document, you can get two different errors indicating encoding problems:

An invalid character was found in text content.

If your XML contains non-ASCII characters and the file is saved as single-byte ANSI (or ASCII) without a specified encoding, you will get an error.

A single-byte XML file with an encoding attribute.

The same single-byte XML file without an encoding attribute.
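The two files above can be imitated in a few lines of Python. This is a minimal sketch, assuming the standard library's `xml.etree.ElementTree` parser stands in for the browser; the helper name `parse_message` and the shortened `<note>` document are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened version of the note document (illustrative only).
NOTE = "<note><message>Norwegian: æøå. French: êèé</message></note>"


def parse_message(data: bytes) -> str:
    """Return the <message> text, or a short error string if parsing fails."""
    try:
        return ET.fromstring(data).findtext("message")
    except ET.ParseError as exc:
        return f"error: {exc}"


# The same single-byte (ISO-8859-1) bytes, with and without an encoding attribute.
with_attr = ('<?xml version="1.0" encoding="ISO-8859-1"?>' + NOTE).encode("iso-8859-1")
without_attr = ('<?xml version="1.0"?>' + NOTE).encode("iso-8859-1")

print(parse_message(with_attr))     # the message text, decoded correctly
print(parse_message(without_attr))  # an error string: the parser assumes UTF-8
```

Without the attribute, the parser falls back to UTF-8, and the single bytes used for æ, ø and å are not valid UTF-8 sequences, so parsing stops at the first one.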

Switch from current encoding to specified encoding not supported.

If your XML file is saved as double-byte Unicode (UTF-16) but specifies a single-byte encoding (Windows-1252, ISO-8859-1, UTF-8), you will get this error.

You will also get this error if your XML file is saved as single-byte ANSI (or ASCII) but specifies a double-byte encoding (UTF-16).

A double-byte XML file without an encoding attribute.

The same double-byte XML file with a single-byte encoding attribute.
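The mismatch between the bytes on disk and the declared encoding can be reproduced with a short Python sketch. It assumes the standard library's `xml.etree.ElementTree` parser; the helper `loads_ok` is a hypothetical name, and the error text a browser shows will differ from the one Python reports.

```python
import xml.etree.ElementTree as ET

BODY = "<note><message>Norwegian: æøå</message></note>"


def loads_ok(declaration: str, saved_as: str) -> bool:
    """Encode declaration + body with `saved_as` and report whether it parses."""
    data = (declaration + BODY).encode(saved_as)
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


# Double-byte file (UTF-16 with a byte order mark), no encoding attribute.
print(loads_ok('<?xml version="1.0"?>', "utf-16"))                            # True
# The same double-byte file, but declaring a single-byte encoding.
print(loads_ok('<?xml version="1.0" encoding="ISO-8859-1"?>', "utf-16"))      # False
# A single-byte file declaring a double-byte encoding.
print(loads_ok('<?xml version="1.0" encoding="UTF-16"?>', "iso-8859-1"))      # False
```

The first case works because the byte order mark at the start of the file already tells the parser that the content is UTF-16.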


Windows Notepad

Windows Notepad will save files as single-byte ANSI (ASCII) by default.

If you select "Save as...", you can choose ANSI, UTF-8, Unicode (UTF-16), or Unicode big endian.

Save the following XML as ANSI, UTF-8, and Unicode (note that the document does not contain an encoding attribute).

<?xml version="1.0"?>
<note>
<from>Jani</from>
<to>Tove</to>
<message>Norwegian: æøå. French: êèé</message>
</note>

Try dragging the file to your browser and see the results. Different browsers will display different results.
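To see why the results differ, it helps to look at the bytes each choice writes for a single character. The sketch below is only an approximation: the mapping from Notepad's options to Python codec names is an assumption ("ANSI" depends on the Windows locale, and cp1252 is assumed here), and the byte order mark that Notepad adds to Unicode files is left out.

```python
# Assumed mapping from Notepad's "Save as" choices to Python codec names.
# "ANSI" varies with the Windows locale; cp1252 (Western European) is assumed.
NOTEPAD_CODECS = {
    "ANSI": "cp1252",
    "UTF-8": "utf-8",
    "Unicode": "utf-16-le",
    "Unicode big endian": "utf-16-be",
}


def bytes_for(char: str) -> dict:
    """Return the hex bytes written for `char` under each assumed codec."""
    return {name: char.encode(codec).hex(" ") for name, codec in NOTEPAD_CODECS.items()}


for name, hexed in bytes_for("æ").items():
    print(f"{name:20} {hexed}")
# ANSI                 e6
# UTF-8                c3 a6
# Unicode              e6 00
# Unicode big endian   00 e6
```

The same letter becomes one byte, two bytes, or two bytes in a different order, so a reader that guesses the wrong encoding sees different characters or an invalid sequence. `bytes.hex` with a separator argument needs Python 3.8 or later.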

Experiment with different encodings:

<?xml version="1.0" encoding="us-ascii"?>
<?xml version="1.0" encoding="windows-1252"?>
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml version="1.0" encoding="UTF-8"?>
<?xml version="1.0" encoding="UTF-16"?>

Try:

Saving with the correct encoding

Saving with the wrong encoding
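Saving with the wrong encoding does not always produce an error. The following sketch, assuming Python's `xml.etree.ElementTree` as the parser and a hypothetical helper `message_from`, shows a UTF-8 file that declares ISO-8859-1: it loads without complaint, but the text comes out garbled.

```python
import xml.etree.ElementTree as ET

TEXT = "Norwegian: æøå"


def message_from(declared: str, saved_as: str) -> str:
    """Build the note with `declared` in the XML declaration, save it as
    `saved_as`, parse the bytes, and return the <message> text."""
    doc = (
        f'<?xml version="1.0" encoding="{declared}"?>'
        f"<note><message>{TEXT}</message></note>"
    )
    return ET.fromstring(doc.encode(saved_as)).findtext("message")


# Correct: the declaration matches the bytes.
print(message_from("UTF-8", "utf-8"))
# Wrong: UTF-8 bytes read as ISO-8859-1. No error, but each non-ASCII
# letter turns into two unrelated characters.
print(message_from("ISO-8859-1", "utf-8"))
```

This silent kind of failure is the harder one to notice, because the document still loads.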



Conclusion

  • Always use the encoding attribute

  • Use an editor that supports encoding

  • Make sure you know what encoding your editor uses

  • Use the same encoding in your encoding attribute
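One way to follow these rules is to let a library write both the declaration and the bytes, so the two cannot disagree. This is a minimal sketch using Python's `xml.etree.ElementTree` (the `xml_declaration` argument of `tostring` needs Python 3.8 or later); the function name `serialize` is made up for this example.

```python
import xml.etree.ElementTree as ET


def serialize(message: str, encoding: str) -> bytes:
    """Serialize a small <note> document; the declaration and the bytes
    are both produced from the same `encoding` argument."""
    note = ET.Element("note")
    ET.SubElement(note, "message").text = message
    return ET.tostring(note, encoding=encoding, xml_declaration=True)


data = serialize("Norwegian: æøå. French: êèé", "ISO-8859-1")
print(data.split(b"\n")[0])                      # the XML declaration line
print(ET.fromstring(data).findtext("message"))   # the text survives a round trip
```

Writing `data` to a file in binary mode keeps the bytes exactly as declared; reopening and re-saving the file in an editor with a different encoding would break the match again.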
