Frequently asked questions about getting started with XML (4)

 How to deal with whitespace characters in the XML object model?

  Sometimes the XML object model shows TEXT nodes that contain only whitespace characters. This can be confusing if you expect whitespace to be stripped. For example, the following XML:


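  <?xml version="1.0"?>
  <!DOCTYPE person [
    <!ELEMENT person (#PCDATA|lastname|firstname)*>
    <!ELEMENT lastname (#PCDATA)>
    <!ELEMENT firstname (#PCDATA)>
  ]>
  <person>
    <lastname>Smith</lastname>
    <firstname>John</firstname>
  </person>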

  generates the following tree:


  Processing Instruction: xml
  DocType: person
  ELEMENT: person
    TEXT:
    ELEMENT: lastname
    TEXT:
    ELEMENT: firstname
    TEXT:


 The first name and last name are surrounded by TEXT nodes containing only whitespace characters, because the content model of the "person" element is MIXED; it contains the #PCDATA keyword. The MIXED content model specifies that text can exist between elements. Therefore, the following is also correct:


  <person>
  My last name is
  <lastname>Smith</lastname>
  and my first name is
  <firstname>John</firstname>
  </person>

 The result is a tree similar to the following:

  ELEMENT: person
    TEXT: My last name is
    ELEMENT: lastname
    TEXT: and my first name is
    ELEMENT: firstname
    TEXT:


  Without the whitespace after the word "is" and before and after the word "and", the sentence would not be understandable. Therefore, in the MIXED content model, text, whitespace characters, and elements are all significant. This is not the case for non-MIXED content models.

 To make the TEXT nodes that contain only whitespace characters disappear, remove the #PCDATA keyword from the "person" element declaration:

  <!ELEMENT person (lastname,firstname)>

 The result is the following clean tree:

  Processing Instruction: xml
  DocType: person
  ELEMENT: person
    ELEMENT: lastname
    ELEMENT: firstname
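
 The whitespace-only TEXT nodes are easy to see from code. The following is a minimal sketch in Python using the standard xml.dom.minidom module rather than MSXML. As far as this sketch relies on it, minidom does not use the DTD content model to drop whitespace, so the helper strip_whitespace_text (a hypothetical name) removes those nodes by hand to produce the same clean tree.

```python
from xml.dom import minidom, Node

# Same person document as in the text, without the DTD.
XML = """<person>
  <lastname>Smith</lastname>
  <firstname>John</firstname>
</person>"""


def describe_children(element):
    """Return (kind, value) pairs for the direct children of element."""
    result = []
    for child in element.childNodes:
        if child.nodeType == Node.TEXT_NODE:
            result.append(("TEXT", child.data))
        elif child.nodeType == Node.ELEMENT_NODE:
            result.append(("ELEMENT", child.tagName))
    return result


def strip_whitespace_text(element):
    """Remove direct TEXT children that contain only whitespace."""
    # Iterate over a copy because removeChild mutates childNodes.
    for child in list(element.childNodes):
        if child.nodeType == Node.TEXT_NODE and not child.data.strip():
            element.removeChild(child)


doc = minidom.parseString(XML)
person = doc.documentElement

before = describe_children(person)   # TEXT nodes sit between the elements
strip_whitespace_text(person)
after = describe_children(person)    # only the two ELEMENT nodes remain

print(before)
print(after)
```

 Applying such a filter to a MIXED content element would be a mistake, because there the whitespace is part of the text.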

 What does an XML declaration do?

 The XML declaration must appear at the very top of the XML document:

  <?xml version="1.0" encoding="utf-8"?>

 It specifies the following items:

 The document is an XML document. MIME detectors can use this to detect if a file is of type text/xml when the MIME type is missing or has not been specified.

 The document complies with the XML 1.0 specification. This will be important in the future if there are other versions of XML.

 Document character encoding. The encoding attribute is optional and defaults to UTF-8.

 Note: The XML declaration must be on the first line of the XML document, so the following XML file:

  <!-- A comment before the declaration -->
  <?xml version="1.0"?>

 produces the following parsing error:

  Invalid xml declaration.
  Line 0000002: <?xml version="1.0"?>
  Location 0000007: ------^

 Note: The XML declaration is optional. If you need to put comments or processing instructions at the top of the file, omit the XML declaration. The encoding then defaults to UTF-8.
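
 The placement rule can be checked with any conforming parser. The sketch below uses Python's xml.dom.minidom, which is built on expat, so the error text differs from the MSXML message quoted above. The helper name parses and the three sample strings are made up for illustration.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def parses(xml_text):
    """Return (True, None) if xml_text is well-formed, else (False, message)."""
    try:
        minidom.parseString(xml_text)
        return True, None
    except ExpatError as err:
        return False, str(err)


# Declaration on the first line: accepted.
declaration_first = '<?xml version="1.0"?>\n<!-- comment -->\n<doc/>'
# Declaration after a comment: rejected by the parser.
comment_first = '<!-- comment -->\n<?xml version="1.0"?>\n<doc/>'
# No declaration at all: accepted, since the declaration is optional.
no_declaration = '<!-- comment -->\n<doc/>'

for label, text in [
    ("declaration first", declaration_first),
    ("comment first", comment_first),
    ("no declaration", no_declaration),
]:
    print(label, parses(text))
```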


 How can I print my XML document in a readable format?

 When you build a document from scratch with the DOM and save it as an XML file, everything ends up on one line with no whitespace between elements. This is the default behavior.

 Use the default XSL stylesheet built into Internet Explorer 5 to display and print XML documents in a readable format. For example, if you have IE5 installed, try opening the nospace.xml file. The browser displays the document as an indented tree in which each element with children has a "-" marker for collapsing it.

  Printing readable XML gets tricky, especially when a DTD defines different types of content models. For example, under a mixed content model (#PCDATA) you cannot insert whitespace, because it might change the meaning of the content. Consider the following XML:

  <B>E</B>lephant

 This had better not be output as:

  <B>E</B>
  lephant

 because the word boundaries are no longer correct.

 All of this makes automated pretty-printing problematic. If you do need to print readable XML, you can use the DOM to insert whitespace characters as text nodes at appropriate locations.
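
 The word-boundary problem is easy to reproduce with a generic pretty-printer. This sketch uses toprettyxml from Python's xml.dom.minidom, not MSXML or the IE5 stylesheet. The wrapper element word and the inline element B are assumed names chosen for the example, and text_of is a hypothetical helper.

```python
from xml.dom import minidom, Node


def text_of(node):
    """Concatenate the data of every TEXT node below node, in document order."""
    if node.nodeType == Node.TEXT_NODE:
        return node.data
    return "".join(text_of(child) for child in node.childNodes)


# Mixed content: one word split by an inline element.
mixed = minidom.parseString("<word><B>E</B>lephant</word>")

# Pretty-print, then parse the result again to see what a reader would get.
pretty = mixed.documentElement.toprettyxml(indent="  ")
reparsed = minidom.parseString(pretty)

original_text = text_of(mixed.documentElement)
pretty_text = text_of(reparsed.documentElement)

print(pretty)
print(repr(original_text))
print(repr(pretty_text))
```

 The pretty-printed text differs from the original only by inserted whitespace, which is exactly the change that breaks the word.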

 How to use namespaces in a DTD?

 To use a namespace in a DTD, declare it in the ATTLIST declaration of the element that uses it. The namespace declaration must be #FIXED. The same goes for namespaces used on attributes.
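
 As an illustration, the sketch below declares a namespace through an ATTLIST in an internal DTD subset and parses the document with Python's xml.etree.ElementTree. The element name x:customer and the URI urn:example:customers are hypothetical. The sketch assumes the expat parser underneath ElementTree, which does not validate but does apply attribute defaults from the internal subset, so the x prefix resolves even though the instance never declares it.

```python
import xml.etree.ElementTree as ET

# Hypothetical element name and namespace URI, for illustration only.
DOC = """<!DOCTYPE x:customer [
  <!ELEMENT x:customer (#PCDATA)>
  <!ATTLIST x:customer xmlns:x CDATA #FIXED "urn:example:customers">
]>
<x:customer>Mark Hanson</x:customer>"""

root = ET.fromstring(DOC)

# ElementTree reports qualified names as {namespace-uri}localname.
print(root.tag)
print(root.text)
```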

 Namespaces and XML schemas

 DTDs and XML schemas cannot be mixed. For example, the following declaration:

  xmlns:x CDATA #FIXED "x-schema:myschema.xml"

 will not cause the schema defined in myschema.xml to be used. DTDs and XML schemas are mutually exclusive.

 How to use XMLDSO in Visual Basic?

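 Use the following XML as an example. Only the name field is referenced later; the other element names are illustrative:

  <customers>
    <customer>
      <name>Mark Hanson</name>
      <telephone>206 765 4583</telephone>
    </customer>
    <customer>
      <name>Jane Smith</name>
      <telephone>425 808 1111</telephone>
    </customer>
  </customers>

 You can bind to an ADO recordset as follows: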


 Create a new VB 6.0 project.

 Add references to Microsoft ActiveX Data Objects 2.1 or later, Microsoft Data Adapter Library, and Microsoft XML version 2.0.

Use the following code to load XML data into the XML DSO control:


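  Dim dso As New XMLDSOControl
  Dim doc As IXMLDOMDocument
  Set doc = dso.XMLDocument
  doc.Load "customers.xml"   ' hypothetical file name; point this at the XML shown above

 Bind the DSO to a new DataAdapter:

  Dim da As New DataAdapter
  Set da.Object = dso

 Bind the DataAdapter to an ADO recordset:

  Dim rs As New ADODB.Recordset
  Set rs.DataSource = da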


 Access the data:

  MsgBox rs.Fields("name").Value

 The result shows the string "Mark Hanson".
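
 For readers without VB 6.0, the same idea of viewing repeated XML elements as rows of fields can be sketched in Python with xml.etree.ElementTree. This is only an analogy to the recordset binding, not the XMLDSO API. The function to_records is a hypothetical helper, and all element names except name are assumptions.

```python
import xml.etree.ElementTree as ET

# Element names other than "name" are assumptions made for this sketch.
XML = """<customers>
  <customer><name>Mark Hanson</name><telephone>206 765 4583</telephone></customer>
  <customer><name>Jane Smith</name><telephone>425 808 1111</telephone></customer>
</customers>"""


def to_records(xml_text):
    """Turn each child of the root element into a dict of {field tag: text}."""
    root = ET.fromstring(xml_text)
    return [{field.tag: field.text for field in row} for row in root]


records = to_records(XML)

# Equivalent in spirit to rs.Fields("name").Value on the first row.
print(records[0]["name"])
```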

 How to use the XML DOM in Java?

 The IE5 version of MSXML.DLL must be installed. In Visual J++ 6.0, select Add COM Wrapper from the Project menu, and then select "Microsoft XML 1.0" from the list of COM objects. This generates the required Java wrappers in a new package called "msxml". Pre-built Java wrappers are also available for download. The classes can be used as follows:

  import com.ms.com.*;
  import msxml.*;

  public class Class1
  {
      public static void main(String[] args)
      {
          // Create the document through the generated COM wrapper.
          DOMDocument doc = new DOMDocument();
          // load() takes a COM VARIANT, so the URL is wrapped in a Variant.
          doc.load(new Variant("file://d:/samples/ot.xml"));
          System.out.println("Loaded " + doc.getDocumentElement().getNodeName());
      }
  }


  The code example loads the 3.8MB test file "ot.xml" from the Sun religion samples. The Variant class wraps the Win32 VARIANT type.

 You cannot use pointer comparisons on nodes, because you get a new wrapper every time you retrieve a node. So instead of the following code:

  IXMLDOMNode root1 = doc.getDocumentElement();
  IXMLDOMNode root2 = doc.getDocumentElement();
  if (root1 == root2) ...

 use this code:

  if (ComLib.isEqualUnknown(root1, root2)) ...

 The total size of the .class wrappers is approximately 160KB. However, for full compliance with the W3C specification, only the IXMLDOM* wrappers should be used. The following classes are old IE 4.0 XML interfaces and can be removed from the msxml folder:

 _xml_error*


 This reduces the size to 147KB. You can also delete the following items:

  DOMFreeThreadedDocument: Accesses XML documents from multiple threads in Java applications.
  XMLHTTPRequest: Uses the XML DAV HTTP extensions to communicate with the server.
  IXTLRuntime: Defines script objects for XSL stylesheets.
  XMLDSOControl: Binds to XML data in an HTML page.
  XMLDOMDocumentEvents: Returns callbacks during parsing.



 This reduces the size to 116KB. To make it smaller still, consider that the DOM itself has two layers. The core layer consists of:

  DOMDocument, IXMLDOMDocument
  IXMLDOMNode*
  IXMLDOMNodeList*
  IXMLDOMNamedNodeMap*
  IXMLDOMDocumentFragment*
  IXMLDOMImplementation
  IXMLDOMParseError

 You may also need to keep the wrappers for DTD information:

  IXMLDOMDocumentType
  IXMLDOMEntity
  IXMLDOMNotation

 All node types in an XML document are IXMLDOMNodes, which provide full functionality, but there are higher-level wrappers for each node type. Therefore, if you modify the DOMDocument wrapper and change these specific types to use IXMLDOMNode, all of the following interfaces can be removed:

  IXMLDOMAttribute
  IXMLDOMCDATASection
  IXMLDOMCharacterData
  IXMLDOMComment
  IXMLDOMProcessingInstruction
  IXMLDOMEntityReference
  IXMLDOMText


 Deleting these reduces the size to 61KB. However, IXMLDOMElement is worth keeping, because its getAttribute and setAttribute methods are useful. Without it, you need to use:

  IXMLDOMNode.getAttributes().setNamedItem(...)
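
 The difference between the two routes can be seen in any W3C-style DOM. The sketch below uses Python's xml.dom.minidom instead of the msxml wrappers, so the method names follow the DOM specification rather than the generated Java classes. The element and attribute names are made up for the example.

```python
from xml.dom import minidom

doc = minidom.parseString("<person/>")
person = doc.documentElement

# Element-level convenience method: one call.
person.setAttribute("lang", "en")

# Node-level route: create the attribute node, set its value,
# then add it through the element's named node map.
attr = doc.createAttribute("id")
attr.value = "42"
person.attributes.setNamedItem(attr)

# Both attributes are visible through either interface.
print(person.getAttribute("id"))
print(person.attributes.getNamedItem("lang").value)
```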

