XML is the SGML of the Web, but it hasn't become as visible on the Web as the XML community had hoped. XML's most prominent achievement on the Web -- XHTML -- has been entangled in politics and design by committee, and other ambitious, technically sound specifications -- such as XForms and SVG -- have been plagued by low usage. Sometimes XML succeeds on the Web in unexpected ways, such as the popularity of XML-formatted Web feeds (the RSS variants and Atom).
Commonly used abbreviations
Ajax: Asynchronous JavaScript + XML
API: Application Programming Interface
CSS: Cascading Style Sheets
DOM: Document Object Model
HTML: Hypertext Markup Language
RSS: Really Simple Syndication
SGML: Standard Generalized Markup Language
SVG: Scalable Vector Graphics
URI: Uniform Resource Identifier
URL: Uniform Resource Locator
W3C: World Wide Web Consortium
XHTML: Extensible Hypertext Markup Language
XML: Extensible Markup Language
Like other technologies on the Web, XML on the Web is browser-centric, but most discussions about processing XML on the Web focus on the server side. In the developerWorks Firefox and XML series, I cover several ways to use XML in the Firefox browser. Unfortunately, processing XML across browsers is even stranger than processing HTML across browsers, which is part of the reason why so much XML processing on the Web sticks to the relatively safe realm of the server side.
Many dynamic HTML developers are tired of the cross-browser pain and quirks of scripting. The emergence of several excellent JavaScript libraries has made developers' lives easier. One of the most popular of these libraries is jQuery, which has been covered in several articles on developerWorks. You can also use jQuery to process XML, if you learn how to steer around some huge pitfalls. In this article, I show how to use jQuery and XML together in a real-world scenario (working with Atom Web feeds), introduce a practical pattern for working with XML in jQuery, and solve some unfortunate real-world problems. You need a basic understanding of XML, XML namespaces, HTML, JavaScript, and the jQuery library.
XML namespace issues
I'll cover the most serious issue first: jQuery doesn't completely solve the XML namespace problem. This well-known problem has been around for a long time, and the various solutions tried so far have been unsatisfactory. The ideal solution would probably be for jQuery to support CSS Level 3 namespace selectors, which add a new kind of selector like this:
@namespace ex url(http://example.com);
ex|quote {font-weight: bold}
The first line declares the prefix ex for the http://example.com namespace, and the second line is a type selector using the new namespace component, in which a vertical bar separates the declared prefix from the local name. Unfortunately, jQuery doesn't support this syntax, so people have resorted to various hacks to deal with namespaces.
Pretending the prefix matters
One of the most common hacks for handling namespaced XML in jQuery is to ignore the namespace and select on the full QName (prefix and local part).
$(xml).find("x\\:quote").each(function() {
    //process each node
});
This code selects on jQuery's notion of the node name, that is, the DOM nodeName property. The name contains a colon, which is a reserved character in jQuery selectors and must be escaped with a backslash. The backslash is in turn a reserved character in JavaScript strings and must be doubled. This hack fails on namespace-equivalent documents that use a different prefix.
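To make the double escaping and the prefix problem concrete, here is a minimal sketch in plain JavaScript. The helper names prefixSelector and matchesQName, and the node-like objects, are hypothetical stand-ins for illustration; they are not part of jQuery or the DOM.

```javascript
// Hypothetical helper: build the escaped selector string for a QName.
// Inside a JavaScript string literal the backslash must itself be doubled,
// so the selector engine receives a single backslash before the colon.
function prefixSelector(prefix, localName) {
  return prefix + '\\:' + localName;
}

// Plain objects standing in for two namespace-equivalent elements
// that happen to use different prefixes.
var nodeA = { nodeName: 'x:quote', namespaceURI: 'http://example.com', localName: 'quote' };
var nodeB = { nodeName: 'q:quote', namespaceURI: 'http://example.com', localName: 'quote' };

// The prefix hack effectively compares against nodeName.
function matchesQName(node, qname) {
  return node.nodeName === qname;
}

var selector = prefixSelector('x', 'quote'); // the 8-character string x\:quote
var hitsA = matchesQName(nodeA, 'x:quote');  // true
var hitsB = matchesQName(nodeB, 'x:quote');  // false, although the namespace is identical
```

The second element is missed only because its author chose a different prefix, which is exactly the situation a namespace-aware query is meant to handle.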
Using attribute filters
Some people have reportedly had success with a variation of the following approach, which applies a jQuery attribute filter to the pseudo-attribute nodeName:
$(xml).find("[nodeName=x:quote]").each(function() {
    //process each node
});
In jQuery versions prior to 1.3.x, you need to put @ in front of nodeName. However, this approach suffers from the same basic problem as the one in the previous section: it depends on the prefix, so it breaks in many real-world namespace scenarios. I tried the following variation, which makes more sense:
$(xml).find("[namespaceURI='http://example.com'][localName='quote']")
.each(function () {
//process each node
});
Unfortunately this doesn’t work.
Looking for a good plugin
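This mess is not entirely jQuery's fault. The DOM provides efficient methods for finding nodes: getElementsByTagName and getElementsByTagNameNS. The latter is namespace-aware: it takes a namespace URI and ignores the prefix. Unfortunately, although every other browser supports it, Microsoft® Internet Explorer® does not. Still, the point of jQuery is to deal with such browser problems so that people don't have to. One possible, if strained, excuse is that jQuery bases its selectors largely on CSS, and even the W3C CSS Level 3 namespace selectors have not made it past the working draft stage. jQuery bug #155, "Get Namespaced Elements in XML Documents", covers these problems but has gone unresolved for 3 years.
Ryan Kelly ran into this problem and made a valiant attempt at a fix, creating jquery.xmlns.js, a jQuery plugin for XML namespace selectors. It aims to support code such as the following.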
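As a rough illustration of what a namespace-aware lookup with a fallback could look like in plain DOM terms, consider the sketch below. The function name elementsByNS is hypothetical, and the use of baseName as the local-name property on older Internet Explorer XML nodes is an assumption of this sketch rather than something the plugins discussed here are known to do.

```javascript
// Sketch: namespace-aware lookup with a fallback for DOM implementations
// that lack getElementsByTagNameNS. The function name is hypothetical.
function elementsByNS(root, namespaceURI, localName) {
  if (typeof root.getElementsByTagNameNS === 'function') {
    // Native path: the DOM matches on namespace URI and ignores the prefix.
    return Array.prototype.slice.call(
      root.getElementsByTagNameNS(namespaceURI, localName)
    );
  }
  // Fallback: scan every element and compare namespace and local name.
  // baseName is assumed to carry the local name where localName is absent.
  var all = root.getElementsByTagName('*');
  var matches = [];
  for (var i = 0; i < all.length; i++) {
    var local = all[i].localName || all[i].baseName;
    if (all[i].namespaceURI === namespaceURI && local === localName) {
      matches.push(all[i]);
    }
  }
  return matches;
}
```

The fallback visits every element, so on large documents it will be slower than the native method.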
$.xmlns["ex"] = "http://example.com";
$(doc).find("ex|quote").each(...);
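The first line is a global namespace declaration for the plugin, a consequence of limitations in the underlying jQuery machinery. The plugin does offer a non-global block for namespace scoping, in typical jQuery idiom. Unfortunately, I had mixed success using this extension. I hope it improves and eventually finds its way into jQuery proper.
A simpler plugin
The solution I finally settled on was to create a simple plugin that does nothing special with jQuery selectors but instead adds a new filter. You pass a namespace and a local name directly to the filter, which narrows the result set to the matching nodes. Use it as follows:
$(xml).find('*').ns_filter('http://example.com', 'quote')
.each(function() {
    //process each node
});
ns_filter is the special filter I wrote. Having to run a separate find('*') may look inelegant, and a simpler variation might be:
$(xml).find('quote').ns_filter('http://example.com')
.each(function() {
    //process each node
});
This does not work, however, because you cannot rely on jQuery to handle a query such as find('quote') in a namespace-neutral way (that is, as a local-name selector). My filter implementation appears in the next section, as part of a general system for setting up jQuery to process XML. I tested it in Firefox 3.5.5 and Safari 4.0.4 on Mac OS X Snow Leopard, and in the latest Internet Explorer 7 and Internet Explorer 8 on Windows® XP.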
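The jQuery XML workbench
The namespace problem is just a symptom of the fact that, when all is said and done, jQuery is an HTML tool. I have found that the most practical way to use jQuery with XML is to create an HTML workbench for the XML document, reference the scripts in a way that works reliably across browsers, and then build whatever workarounds you need, such as the one for the XML namespace problem. You can use the workbench pattern to prepare and test patterns and techniques for browser-based XML processing, and you can even use the workbench as the basis of a browser-based application itself.
Listing 1 (quotes.html) is a simple example of an HTML workbench. It dynamically loads quotes from an XML file.
Listing 1 (quotes.html). HTML example using the jQuery XML workbench
You need script elements to load jQuery, the workbench JavaScript, and your application-specific script. You also need a link element with rel="target_XML" whose href identifies the XML file to load. If you need to work with multiple XML files, it's easy to extend your workbench setup. Listing 2 (workbench.js) is the workbench script.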
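One way the workbench might be extended to several XML files is sketched below. The function name load_targets, its parameters, and the idea of passing in a list of URLs are hypothetical; only the $.ajax options mirror those the workbench itself uses. jQuery is passed in as a parameter so that the sketch can also be exercised outside a browser.

```javascript
// Hypothetical extension: request several XML files and call back once per file.
// jq is the jQuery object; urls is an array of hrefs, for example collected
// from several link elements; on_ready receives (url, xmlDocument).
function load_targets(jq, urls, on_ready) {
  for (var i = 0; i < urls.length; i++) {
    // The immediately invoked function keeps each url bound to its own callback.
    (function (url) {
      jq.ajax({
        url: url,
        type: 'GET',
        dataType: 'xml',
        complete: function (result) {
          on_ready(url, result.responseXML);
        }
      });
    })(urls[i]);
  }
}
```

In a page you might call load_targets(jQuery, ['quotes1.xml', 'quotes2.xml'], handler), where handler is your own function. Responses can arrive in any order, which is why the URL is handed back along with each document.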
Listing 2 (workbench.js). jQuery XML Workbench JavaScript
/*
workbench.js
*/
// The jQuery hook invoked once the DOM is fully ready
$(document).ready(function() {
    // Get the target XML file contents (Ajax call)
    var fileurl = $("link[rel='target_XML']").attr('href');
    $.ajax({
        url: fileurl,
        type: "GET",
        dataType: "xml",
        complete: xml_ready,
        error: error_func
    });
});

// Callback for when the Ajax call results in an error
function error_func(result) {
    alert(result.responseText);
}

// ns_filter, a jQuery extension for XML namespace queries
(function($) {
    $.fn.ns_filter = function(namespaceURI, localName) {
        return $(this).filter(function() {
            var domnode = $(this)[0];
            return (domnode.namespaceURI == namespaceURI && domnode.localName == localName);
        });
    };
})(jQuery);
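The comparison at the heart of ns_filter can be tried without a browser. The following sketch applies the same kind of test, with strict equality, to plain objects that stand in for DOM nodes. The function name same_expanded_name and the sample data are hypothetical.

```javascript
// The namespace-and-local-name test, written as a standalone predicate.
function same_expanded_name(node, namespaceURI, localName) {
  return node.namespaceURI === namespaceURI && node.localName === localName;
}

// Stand-ins for elements that share a local name but differ in prefix or namespace.
var nodes = [
  { nodeName: 'x:quote', namespaceURI: 'http://example.com',   localName: 'quote' },
  { nodeName: 'q:quote', namespaceURI: 'http://example.com',   localName: 'quote' },
  { nodeName: 'quote',   namespaceURI: 'http://other.example', localName: 'quote' }
];

var matched = nodes.filter(function (node) {
  return same_expanded_name(node, 'http://example.com', 'quote');
});
// matched holds the first two nodes: the prefix is ignored, the namespace is not.
```

Because the prefix never enters the comparison, the result is the same whichever prefix a document author chose.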
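Listing 3 (quotes.js). Application code for dynamic quote viewer
/*
quotes.js
*/
function xml_ready(result){
    var xml = result.responseXML;
    //Make sure the target area for inserting data is clear
    //(the #update-target id is assumed to match the one used in home.js)
    $('#update-target').empty();
    //The namespace and the local name 'quote' are assumed from the earlier examples
    $(xml).find('*').ns_filter('http://example.com', 'quote').each(function(){
        var quote_text = $(this).text();
        //The li element is assumed markup
        $('<li></li>').text(quote_text).appendTo('#update-target');
    });
}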
Listing 4 (quotes1.xml) is an XML file with a list of quotes.
Listing 4 (quotes1.xml). XML file with a list of quotes
Note that I used the x prefix, which means that, in theory, I could try the prefix-based hack mentioned above. That hack breaks, however, if you substitute the quotes file from Listing 5 (quotes2.xml), which is namespace-equivalent to Listing 4 and has the same Canonical XML.
Listing 5 (quotes2.xml). An XML file equivalent to Listing 4, with a list of quotes
Words have meaning and names have power
Sticks and stones will break my bones, but names will never hurt me.
The beginning of wisdom is to call things by their right names.
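Better to see the face than to hear the name.
If you substitute quotes2.xml in Listing 1, you will find that it still works, which is an important test of namespace handling. Figure 1 is the browser display of quotes.html.
Figure 1. Quotes displayed using the jQuery XML workbench
Dynamic display of Atom XML
Once you can process XML in jQuery, you can handle more useful XML formats, including Web feed formats such as RSS and Atom. In this section I use the jQuery XML workbench to display the latest entries from an Atom feed on a Web page. Listing 6 is the HTML page.
Listing 6 (home.html). Web page hosting dynamic XML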
jQuery XML workbench
Caesar's home page
GALLIA est omnis divisa in partes tres, quarum unam incolunt Belgae,
aliam Aquitani, tertiam qui ipsorum lingua Celtae, nostra Galli
appellantur. Hi omnes lingua, institutis, legibus inter se differunt.
Gallos ab Aquitanis Garumna flumen, a Belgis Matrona et Sequana dividit.
Horum omnium fortissimi sunt Belgae, propterea quod a cultu atque
humanitate provinciae longissime absunt, minimeque ad eos mercatores saepe
commeant atque ea quae ad effeminandos animos pertinent important,
proximique sunt Germanis, qui trans Rhenum incolunt, quibuscum continenter
bellum gerunt. Qua de causa Helvetii quoque reliquos Gallos virtute
praecedunt, quod fere cotidianis proeliis cum Germanis contendunt, cum aut
suis finibus eos prohibent aut ipsi in eorum finibus bellum gerunt.
My Web feed
Listing 7 (atom1.xml) is the Atom file it references.
Listing 7 (atom1.xml). Sample Atom file
xml:lang="en"
xml:base="http://www.example.org">
http://www.example.org/myfeed
My Simple Feed
2005-07-15T12:00:00Z
Uche Ogbuji
http://www.example.org/entries/1
A simple blog entry
2005-07-14T12:00:00Z
This is a simple blog entry
http://www.example.org/entries/2
2005-07-15T12:00:00Z
This is simple blog entry without a title
Listing 8 is home.js, which contains the dynamic application code loaded onto the workbench.
Listing 8 (home.js). Application code for the home page Web feed display
/*
home.js
*/
var ATOM_NS = 'http://www.w3.org/2005/Atom';

function xml_ready(result){
    var xml = result.responseXML;
    //Make sure the target area for inserting data is clear
    $("#update-target").empty();
    $(xml).find('*').ns_filter(ATOM_NS, 'entry').each(function(){
        var title_elem = $(this).find('*').ns_filter(ATOM_NS, 'title').clone();
        var link_text = $(this).find('[rel="alternate"]')
                            .ns_filter(ATOM_NS, 'link')
                            .attr('href');
        var summary_elem = $(this).find('*').ns_filter(ATOM_NS, 'summary').clone();

        //Deal with the case of a missing title
        if (!title_elem.text()){
            title_elem = '[No title]';
        }

        //Deal with the case where rel='alternate' is omitted
        if (!link_text){
            link_text = $(this).find('*')
                            .ns_filter(ATOM_NS, 'link')
                            .not('[rel]')
                            .attr('href');
        }

        //Update the target area with the entry information
        //(the p wrapper and the a element are assumed markup)
        $('<p></p>')
            .append(
                $('<a></a>').attr('href', link_text)
                    .append(title_elem)
            )
            .append(' - ')
            .append(summary_elem.clone())
            .fadeIn('slow') //bonus animation
            .appendTo('#update-target');
    }); //close each(
}
Again, I've commented this file, but a few points are worth highlighting. Atom allows many variations in its elements, most of which are optional, so you have to handle exceptions. The code deals with two common ones: a required link whose rel="alternate" attribute is omitted, and an entry with no title. As you can see, jQuery provides tremendous flexibility in handling these situations, so you should be able to cope even with this irregular XML format. In some cases I copied structure directly from the XML into the main (host HTML) document. This requires great care: notice that I use the clone() method to make sure that I don't graft nodes from one document onto another, which would raise the DOM exception WRONG_DOCUMENT_ERR. Additionally, I used the jQuery method fadeIn so that the added content fades gradually into view. Figure 2 is the browser display of home.html.
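The two fallbacks can also be expressed in plain JavaScript, independent of jQuery and the DOM. In this sketch an entry is modeled as an object with an optional title and a list of links; that shape and the function names entry_title and entry_link are assumptions made for illustration, not part of any Atom API.

```javascript
// Sketch of the two fallbacks. An entry is modeled as
// { title: string or undefined, links: [{ rel: string or undefined, href: string }] }.
function entry_title(entry) {
  return entry.title ? entry.title : '[No title]';
}

function entry_link(entry) {
  var fallback;
  for (var i = 0; i < entry.links.length; i++) {
    var link = entry.links[i];
    if (link.rel === 'alternate') {
      return link.href;       // an explicit rel="alternate" wins
    }
    if (link.rel === undefined && fallback === undefined) {
      fallback = link.href;   // otherwise use the first link with no rel attribute
    }
  }
  return fallback;
}

var first = {
  title: 'A simple blog entry',
  links: [{ rel: 'alternate', href: 'http://www.example.org/entries/1' }]
};
var second = {
  links: [
    { rel: 'self', href: 'http://www.example.org/entries/2.atom' },
    { href: 'http://www.example.org/entries/2' }
  ]
};
```

Here entry_title(second) yields the placeholder text and entry_link(second) skips the rel="self" link in favor of the one without a rel attribute, mirroring the order of checks in home.js.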