Detailed explanation of encoding in XML


黄舟

Mar 22, 2017 pm 04:57 PM

The other day I was discussing with a colleague the relationship between the encoding attribute in an XML declaration and the file format, and I finally understood it thoroughly.
I used to believe that the encoding declared in an XML file had to match the format the file was saved in. That is, if a file had an XML declaration with an encoding attribute, I assumed the file's BOM had to agree with it. (I later found out that FF FE is not the BOM of UTF-8; FF FE is the UTF-16 little-endian BOM, and the UTF-8 BOM is EF BB BF. This means my misunderstanding lasted for quite a while...)
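To make the FF FE mix-up concrete, here is a minimal Python sketch that lists the common byte order marks and names the one a byte string starts with. The names `KNOWN_BOMS` and `bom_name` are hypothetical helpers made up for this illustration; the BOM constants themselves come from Python's standard `codecs` module.

```python
import codecs

# Byte order marks from Python's standard codecs module.
# UTF-32 is listed before UTF-16 because the UTF-32 LE mark (FF FE 00 00)
# begins with the same two bytes as the UTF-16 LE mark (FF FE).
KNOWN_BOMS = [
    ("UTF-32 LE", codecs.BOM_UTF32_LE),
    ("UTF-32 BE", codecs.BOM_UTF32_BE),
    ("UTF-8", codecs.BOM_UTF8),
    ("UTF-16 LE", codecs.BOM_UTF16_LE),
    ("UTF-16 BE", codecs.BOM_UTF16_BE),
]


def bom_name(data: bytes):
    """Return the name of the BOM that starts `data`, or None if there is none."""
    for name, bom in KNOWN_BOMS:
        if data.startswith(bom):
            return name
    return None


if __name__ == "__main__":
    # Print each BOM as hex bytes, e.g. the UTF-8 mark as "EF BB BF".
    for name, bom in KNOWN_BOMS:
        print(f"{name:10} {bom.hex(' ').upper()}")
```

Calling `bom_name` on the first few bytes of a file read in binary mode shows what is really on disk, independent of what an editor chooses to display.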
Let me briefly go through the stages of the discussion.
At the start of the discussion, I told him with certainty that the encoding value must match the file format (that is, the BOM, short for byte order mark); otherwise errors could occur when parsing the XML. (What I meant at the time was this: if the document contains some Unicode character and the format indicated by the encoding attribute or the BOM does not match, parsing fails.) He then told me that this did not seem to be the case: an XML file he had created with Delphi had no BOM, contained Chinese text, declared its encoding as UTF-8, and still opened normally in IE.
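The colleague's observation can be reproduced with a small Python sketch using the standard `xml.etree.ElementTree` parser rather than IE or Delphi, so it illustrates the same idea with a different tool. The sample documents and the `try_parse` helper are made up for this example: one is BOM-less UTF-8 with Chinese text, the other stores the same text as GBK bytes while still declaring UTF-8.

```python
import xml.etree.ElementTree as ET

# UTF-8 bytes with no BOM; the element text is the two characters "中文".
utf8_no_bom = (
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b"<note>\xe4\xb8\xad\xe6\x96\x87</note>"
)

# The same document, but with the text stored as GBK bytes while the
# declaration still claims UTF-8.
gbk_mislabeled = (
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b"<note>\xd6\xd0\xce\xc4</note>"
)


def try_parse(data: bytes):
    """Return the root element's text, or None if the parser rejects the bytes."""
    try:
        return ET.fromstring(data).text
    except ET.ParseError:
        return None


if __name__ == "__main__":
    print(try_parse(utf8_no_bom))     # parses without needing a BOM
    print(try_parse(gbk_mislabeled))  # rejected: bytes are not valid UTF-8
```

What matters here is that the bytes actually agree with the declared encoding; the presence of a BOM is not required for the first document to parse.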
After he discovered that the XML file he had created had no BOM, we ran into something interesting: when UE (UltraEdit) opens such a file containing Unicode characters, it automatically adds FF FE to the front of the file so that the file displays correctly. So if you view a file that originally has no BOM in UE's hexadecimal mode, you will see an extra BOM. This behavior can be turned off in UE's options; if you are curious, you can find the setting yourself.
At that point I was a little confused. How could this be? I thought it over again and again, and then he suddenly sent me a message with the following content:

The W3C defines three rules for how an XML parser determines the encoding of an XML file:
1. If the document has a BOM (byte order mark; generally a file saved in a Unicode format contains a BOM, while an ANSI file does not), the BOM determines the file encoding.
2. If there is no BOM, the parser checks the encoding attribute of the XML declaration.
3. If neither is present, the XML document is assumed to be encoded in UTF-8.

With these three rules, the matter becomes much clearer.
First, the XML parser reads the file according to its BOM; if no BOM is found, it uses the encoding specified by the encoding attribute in the XML declaration; if no encoding is specified there either, it parses the document as UTF-8 by default. It follows that if both a BOM and an encoding attribute are present, the BOM takes precedence.
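The three rules as quoted above can be sketched in a few lines of Python. This is only an illustration of the order of checks described in this article, not the full autodetection procedure of the XML specification and not how any particular parser is implemented. The function name `guess_xml_encoding` and the regular expression are assumptions made for the example, and the declaration lookup only works when the declaration itself is stored in an ASCII-compatible encoding.

```python
import codecs
import re

# (BOM bytes, Python codec name). UTF-32 comes first because its
# little-endian BOM starts with the UTF-16 LE BOM bytes.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

# Matches encoding="..." inside a leading XML declaration.
_DECL = re.compile(
    rb"""^<\?xml[^>]*?\bencoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def guess_xml_encoding(data: bytes) -> str:
    """Apply the article's three rules, in order, to the raw bytes of a file."""
    # Rule 1: a BOM, if present, decides the encoding.
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    # Rule 2: otherwise use the encoding attribute of the XML declaration.
    match = _DECL.match(data)
    if match:
        return match.group(1).decode("ascii")
    # Rule 3: with neither, assume UTF-8.
    return "utf-8"
```

For instance, a file that starts with the UTF-8 BOM but declares `encoding="GBK"` is reported as `utf-8` by this sketch, which matches the conclusion that the BOM takes precedence.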
Ah! I suddenly realized how valuable it is to have a standards document, even though the rules seem so natural in hindsight.
At this point I finally understood the relationship between the encoding attribute and the file format in XML. Although this write-up is only a few hundred words long, the discussion itself took almost two hours.


