search
Homephp教程php手册PHP用SAX解析XML的实现代码与问题分析

近日在做一个解析XML的小程序,因为服务器是PHP4的,XML解析函数只能用SAX方式的xml_parser来解析了。

代码如下:
$g_books = array();
$g_elem = null;
function startElement( $parser, $name, $attrs )
{
global $g_books, $g_elem;
if ( $name == 'BOOK' ) $g_books []= array();
$g_elem = $name;
}
function endElement( $parser, $name )
{
global $g_elem;
$g_elem = null;
}
function textData( $parser, $text )
{
global $g_books, $g_elem;
if ( $g_elem == 'AUTHOR' ||
$g_elem == 'PUBLISHER' ||
$g_elem == 'TITLE' )
{
$g_books[ count( $g_books ) - 1 ][ $g_elem ] = $text;
}
}
$parser = xml_parser_create();
xml_set_element_handler( $parser, "startElement", "endElement" );
xml_set_character_data_handler( $parser, "textData" );
$f = fopen( 'books.xml', 'r' );
while( $data = fread( $f, 4096 ) )
{
xml_parse( $parser, $data );
}
xml_parser_free( $parser );
foreach( $g_books as $book )
{
echo $book['TITLE']." - ".$book['AUTHOR']." - ";
echo $book['PUBLISHER']."\n";
}
?>

PHP中用SAX方式解析XML发现的问题
XML如下:
so.xml
代码如下:



1047869
2008-08-28 14:54:51
红花还需绿叶扶--浅谈脚架云台的选购
很多专业摄影师在选购三脚架的时候,往往出手阔绰,3、4000元一个的捷信或者曼富图三脚架常常不用经过思考就买下来了,可是,他们却总是忽视了云台的精挑细眩其实,数码相机架在三脚架上面究竟稳不稳,起决定作用的是云台,那么我们如何才能挑选到一款稳如磐石的云台呢?云台家族种类繁多用途迥异简单的说,脚架云台是用于连接相机与脚架进行角度调节的部件,主要分成三维云台和球型云台。三维云台在横向旋转

...(省略若干行)


xml_class.php
代码如下:
class xml {
var $parser;
var $i =0;
var $search_result = array();
var $row = array();
var $data = array();
var $now_tag;
var $tags = array("ID", "CLASSID", "SUBCLASSID", "CLASSNAME", "TITLE", "SHORTTITLE", "AUTHOR", "PRODUCER", "SUMMARY", "CONTENT", "DATE");
function xml()
{
$this->parser = xml_parser_create();
xml_set_object($this->parser, $this);
xml_set_element_handler($this->parser, "tag_open", "tag_close");
xml_set_character_data_handler($this->parser, "cdata");
}
function parse($data)
{
xml_parse($this->parser, $data);
}
function tag_open($parser, $tag, $attributes)
{
$this->now_tag=$tag;
if($tag=='RESULT') {
$this->search_result = $attributes;
}
if($tag=='ROW') {
$this->row[$this->i] = $attributes;
}
}
function cdata($parser, $cdata)
{
if(in_array($this->now_tag, $this->tags)){
$tagname = strtolower($this->now_tag);
$this->data[$this->i][$tagname] = $cdata;
}
}
function tag_close($parser, $tag)
{
$this->now_tag="";
if($tag=='ROW') {
$this->i++;
}
}
}
?>

search.php
代码如下:
require_once("./xml_class.php");
$xml = file_get_contents("./so.xml");
$xml_parser = new xml();
$xml_parser->parse($xml);
print_r($xml_parser);
?>

最后得到的结果中summary中的数据少了很多,总是得不到完整的summary内容。有时还会得到乱码,在网上也找了半天也不知道是什么问题引起的。
  后来才发现问题是因为xml_parser解析XML是循环处理节点中的数据的,每次只取大概300个字符长度(具体是多少,我也不太清楚,只是用strlen输出大概在300左右),于是才知道是因为每次的循环就会把前次的数据给复盖了,这样就会出现数据不全的问题。
  解决办法就是把xml_class文件中的xml类中的cdata方法中$this->data[$this->i][$tagname] = $cdata;改为$this->data[$this->i][$tagname] .= $cdata;即可解决(其中有一些NOTICE错误,PHP已忽略了).
Statement
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

Video Face Swap

Video Face Swap

Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Tools

mPDF

mPDF

mPDF is a PHP library that can generate PDF files from UTF-8 encoded HTML. The original author, Ian Back, wrote mPDF to output PDF files "on the fly" from his website and handle different languages. It is slower than original scripts like HTML2FPDF and produces larger files when using Unicode fonts, but supports CSS styles etc. and has a lot of enhancements. Supports almost all languages, including RTL (Arabic and Hebrew) and CJK (Chinese, Japanese and Korean). Supports nested block-level elements (such as P, DIV),

Safe Exam Browser

Safe Exam Browser

Safe Exam Browser is a secure browser environment for taking online exams securely. This software turns any computer into a secure workstation. It controls access to any utility and prevents students from using unauthorized resources.

SublimeText3 Mac version

SublimeText3 Mac version

God-level code editing software (SublimeText3)

SecLists

SecLists

SecLists is the ultimate security tester's companion. It is a collection of various types of lists that are frequently used during security assessments, all in one place. SecLists helps make security testing more efficient and productive by conveniently providing all the lists a security tester might need. List types include usernames, passwords, URLs, fuzzing payloads, sensitive data patterns, web shells, and more. The tester can simply pull this repository onto a new test machine and he will have access to every type of list he needs.

SAP NetWeaver Server Adapter for Eclipse

SAP NetWeaver Server Adapter for Eclipse

Integrate Eclipse with SAP NetWeaver application server.