
Some methods of parsing XML with PHP

WBOY
2016-07-13 10:33:27

First, a note on encoding. If the XML file and the page that displays it use different encodings, the output will be garbled. To fix garbled Chinese characters, convert the string when you output it, for example: echo iconv("UTF-8","GBK",$Song_Url);

Encoding of PHP web pages

The encoding of the PHP file itself should match the encoding of the web page. To use gb2312, have PHP send the header header("Content-Type: text/html; charset=gb2312"), add the matching charset <meta> declaration to the page, and save every file as ANSI (on a Chinese-language Windows system, ANSI means the GBK code page, which covers gb2312). To do that, open the file in Notepad, choose Save As, select ANSI as the encoding, and overwrite the source file.

To use utf-8, have PHP send the header header("Content-Type: text/html; charset=utf-8"), add the matching charset <meta> declaration to the page, and save every file as utf-8. Saving as utf-8 can be troublesome, because many editors write a BOM at the beginning of the file. The BOM is sent to the browser before any PHP code runs, so anything that has to send headers first, such as starting a session, will fail. You can use editplus to save without it: in editplus, go to Tools->Parameter Selection->File->UTF-8 Sign, select Always delete, and then save to remove the BOM.
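If a file or an XML string has already been saved with a BOM, it can also be removed in code. The following is a minimal sketch; the function name stripUtf8Bom is a hypothetical helper, not something from the original article.

```php
<?php
// Hypothetical helper: remove a leading UTF-8 byte order mark (EF BB BF), if present.
function stripUtf8Bom(string $text): string
{
    if (substr($text, 0, 3) === "\xEF\xBB\xBF") {
        return substr($text, 3);
    }
    return $text; // no BOM: return the input unchanged
}
```

It could be applied to the result of file_get_contents() before the content is parsed or echoed.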

PHP strings are byte sequences and the built-in string functions are not Unicode-aware, so for multibyte text functions such as substr must be replaced with mb_substr (the mbstring extension needs to be installed); alternatively, use iconv to transcode.
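The difference between the byte-oriented and the multibyte functions can be seen in a short sketch. It assumes the source file is saved as UTF-8 and that mbstring is installed; the variable names are illustrative.

```php
<?php
$text = "中文编码"; // four characters, twelve bytes in UTF-8

$byteLength = strlen($text);                   // counts bytes
$charLength = mb_strlen($text, 'UTF-8');       // counts characters
$firstTwo   = mb_substr($text, 0, 2, 'UTF-8'); // the first two characters
$broken     = substr($text, 0, 2);             // the first two bytes: an incomplete character
```

Cutting with substr in the middle of a character is a typical source of garbled output.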

Data interaction between PHP and MySQL

The encoding used by PHP and the encoding used by the database should be consistent.

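Modify the MySQL configuration file my.ini or my.cnf. It is best to use utf8 encoding for MySQL:

[mysql]
default-character-set=utf8
[mysqld]
default-character-set=utf8
default-storage-engine=MyISAM

Add under [mysqld]:

default-collation=utf8_bin
init_connect='SET NAMES utf8'

These option names are for older MySQL versions. From MySQL 5.5 on, the [mysqld] options default-character-set and default-collation are no longer accepted; use character-set-server and collation-server instead.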

Before the PHP code performs any database operations, set the connection encoding with mysql_query("set names 'encoding'");, where the encoding matches the PHP encoding: gb2312 if PHP uses gb2312, and utf8 if PHP uses utf-8. This prevents garbled characters when inserting or retrieving data. Note that the mysql_* functions were removed in PHP 7; with mysqli the equivalent call is set_charset().
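The mapping between the page charset and the MySQL charset name can be sketched as a small lookup. The function name mysqlCharsetFor and the connection details in the usage comment are hypothetical; only mysqli::set_charset() is a real API call here.

```php
<?php
// Hypothetical helper: map the charset used in the Content-Type header
// to the name MySQL expects for the connection.
function mysqlCharsetFor(string $pageCharset): string
{
    $map = ['utf-8' => 'utf8', 'gb2312' => 'gb2312', 'gbk' => 'gbk'];
    $key = strtolower($pageCharset);
    if (!isset($map[$key])) {
        throw new InvalidArgumentException("No MySQL charset known for $pageCharset");
    }
    return $map[$key];
}

// Usage with mysqli (host, user, password and database are placeholders):
// $db = new mysqli('localhost', 'user', 'password', 'mydb');
// $db->set_charset(mysqlCharsetFor('utf-8'));
```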

PHP and the operating system

Windows and Linux use different encodings for file names. In a Windows environment, PHP functions that take a file name, such as move_uploaded_file(), filesize(), and readfile(), fail if that file name is encoded in utf-8. These functions are often used when processing uploads and downloads. The following errors may occur when calling them:

Warning: move_uploaded_file()[function.move-uploaded-file]:failed to open stream: Invalid argument in ...
Warning: move_uploaded_file()[function.move-uploaded-file]:Unable to move '' to '' in ...
Warning: filesize() [function.filesize]: stat failed for ... in ...
Warning: readfile() [function.readfile]: failed to open stream: Invalid argument in ..

These errors do not occur when using gb2312 encoding in a Linux environment, but the saved file name will be garbled and the file cannot be read. In either case, the file name can be converted to the encoding the operating system expects. The conversion can be done with mb_convert_encoding(string, new encoding, original encoding) or iconv(original encoding, new encoding, string). Note that the argument order of the two functions differs. After conversion the saved file name is no longer garbled and the file can be read normally, so files with Chinese names can be uploaded and downloaded.
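The two conversion calls can be compared side by side. This sketch assumes the source file is saved as UTF-8, that the target system expects GBK, and that both the mbstring and iconv extensions are available; the file name is made up for illustration.

```php
<?php
$utf8Name = "中文.txt"; // illustrative file name, UTF-8 encoded

// mb_convert_encoding(string, new encoding, original encoding)
$viaMb = mb_convert_encoding($utf8Name, 'GBK', 'UTF-8');

// iconv(original encoding, new encoding, string)
$viaIconv = iconv('UTF-8', 'GBK', $utf8Name);
```

Both variables now hold the same GBK byte sequence, which is what would be passed to move_uploaded_file() or readfile() on a system that expects GBK file names.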

There is a better solution that is completely independent of the system, so its encoding no longer matters. Generate a file name made only of letters and digits, and save the original name with Chinese characters in the database. This way move_uploaded_file() has no problem. When downloading, you only need to send the original Chinese name as the file name. The code to implement downloading is as follows:

header("Pragma: public");
header("Expires: 0");
header("Cache-Control: must-revalidate, post-check=0, pre-check=0");
header("Content-Type: $file_type");
header("Content-Length: $file_size");
header("Content-Disposition: attachment; filename=\"$file_name\"");
header("Content-Transfer-Encoding: binary");
readfile($file_path);

$file_type is the MIME type of the file, $file_size is its size in bytes, $file_name is the original name, and $file_path is the path of the file saved on the server.
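The idea of storing uploads under an ASCII-only name and sending the original name only at download time could look like the sketch below. Both function names are hypothetical, random_bytes() requires PHP 7 or later, and the filename* form follows the RFC 5987 encoding used by the Content-Disposition header for non-ASCII names.

```php
<?php
// Hypothetical helper: build a letters-and-digits name for storing an upload on disk.
function makeStoredName(string $originalName): string
{
    $ext = strtolower(pathinfo($originalName, PATHINFO_EXTENSION));
    // Keep the extension only if it is plain ASCII letters and digits.
    $suffix = preg_match('/^[a-z0-9]+$/', $ext) ? '.' . $ext : '';
    return bin2hex(random_bytes(16)) . $suffix;
}

// Hypothetical helper: Content-Disposition value carrying a UTF-8 original name.
function contentDispositionFor(string $originalName): string
{
    return "attachment; filename*=UTF-8''" . rawurlencode($originalName);
}
```

The stored name would be passed to move_uploaded_file() and saved in the database next to the original name; at download time the value from contentDispositionFor() would be sent with header("Content-Disposition: ...").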

books.xml

<books>
	<book>
		<author>Jack Herrington</author>
		<title>PHP Hacks</title>
		<publisher>O'Reilly</publisher>
	</book>
	<book>
		<author>Jack Herrington</author>
		<title>Podcasting Hacks</title>
		<publisher>O'Reilly</publisher>
	</book>
</books>

Read XML using DOM library:

<?php
$doc = new DOMDocument();
$doc->load( 'books.xml' );
$books = $doc->getElementsByTagName( "book" );
foreach( $books as $book )
{
	$authors = $book->getElementsByTagName( "author" );
	$author = $authors->item(0)->nodeValue;
	$publishers = $book->getElementsByTagName( "publisher" );
	$publisher = $publishers->item(0)->nodeValue;
	$titles = $book->getElementsByTagName( "title" );
	$title = $titles->item(0)->nodeValue;
	echo "$title - $author - $publisher\n";
}
?>
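The same DOM approach can be wrapped in a function that takes the XML as a string and returns the data instead of printing it, which makes it easier to reuse. This is a sketch only; the function name readBooksWithDom is hypothetical and the DOM extension is assumed to be enabled.

```php
<?php
// Hypothetical helper: parse a books document held in a string
// and return one array per <book> element.
function readBooksWithDom(string $xml): array
{
    $doc = new DOMDocument();
    if (!$doc->loadXML($xml)) {
        return []; // malformed XML: loadXML() returns false and emits a warning
    }
    $books = [];
    foreach ($doc->getElementsByTagName('book') as $book) {
        $row = [];
        foreach (['title', 'author', 'publisher'] as $field) {
            $node = $book->getElementsByTagName($field)->item(0);
            // A missing child element is recorded as null instead of causing an error.
            $row[$field] = $node !== null ? $node->nodeValue : null;
        }
        $books[] = $row;
    }
    return $books;
}
```

For a file on disk, the content could be passed in with readBooksWithDom(file_get_contents('books.xml')).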

Read XML with SAX parser:

<?php
$g_books = array();
$g_elem = null;

// Element names arrive in upper case because case folding is on by default.
function startElement( $parser, $name, $attrs )
{
	global $g_books, $g_elem;
	if ( $name == 'BOOK' ) $g_books []= array();
	$g_elem = $name;
}

function endElement( $parser, $name )
{
	global $g_elem;
	$g_elem = null;
}

// Store the text of the fields we care about in the current book.
function textData( $parser, $text )
{
	global $g_books, $g_elem;
	if ( $g_elem == 'AUTHOR' ||
		$g_elem == 'PUBLISHER' ||
		$g_elem == 'TITLE' )
	{
		$g_books[ count( $g_books ) - 1 ][ $g_elem ] = $text;
	}
}

$parser = xml_parser_create();
xml_set_element_handler( $parser, "startElement", "endElement" );
xml_set_character_data_handler( $parser, "textData" );

// Feed the file to the parser in 4 KB chunks.
$f = fopen( 'books.xml', 'r' );
while( $data = fread( $f, 4096 ) )
{
	xml_parse( $parser, $data );
}
fclose( $f );
xml_parser_free( $parser );

foreach( $g_books as $book )
{
	echo $book['TITLE']." - ".$book['AUTHOR']." - ";
	echo $book['PUBLISHER']."\n";
}
?>
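A variant of the SAX approach that avoids global variables can be sketched with closures. It also appends text instead of assigning it, because the parser is allowed to deliver the text of one element in several pieces. The function name readBooksWithSax is hypothetical, and the whole document is assumed to fit in a string.

```php
<?php
// Hypothetical helper: parse a books document with the xml_* functions,
// keeping the parser state in local variables instead of globals.
function readBooksWithSax(string $xml): array
{
    $books = [];
    $current = null; // name of the element whose text is being collected

    $parser = xml_parser_create();
    xml_set_element_handler(
        $parser,
        function ($p, $name, $attrs) use (&$books, &$current) {
            // Names are upper-cased because case folding is on by default.
            if ($name === 'BOOK') {
                $books[] = ['AUTHOR' => '', 'TITLE' => '', 'PUBLISHER' => ''];
            } elseif ($books && isset($books[count($books) - 1][$name])) {
                $current = $name;
            }
        },
        function ($p, $name) use (&$current) {
            $current = null;
        }
    );
    xml_set_character_data_handler(
        $parser,
        function ($p, $text) use (&$books, &$current) {
            // Text may arrive in several calls, so append rather than assign.
            if ($current !== null) {
                $books[count($books) - 1][$current] .= $text;
            }
        }
    );
    xml_parse($parser, $xml, true); // true: this is the final (and only) chunk
    return $books;
}
```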

Parse XML with regular expressions:

<?php
$xml = "";
$f = fopen( 'books.xml', 'r' );
while( $data = fread( $f, 4096 ) ) { $xml .= $data; }
fclose( $f );

// The slash in each closing tag is escaped because "/" is the pattern delimiter.
preg_match_all( "/<book>(.*?)<\/book>/s",
	$xml, $bookblocks );
foreach( $bookblocks[1] as $block )
{
	preg_match_all( "/<author>(.*?)<\/author>/",
		$block, $author );
	preg_match_all( "/<title>(.*?)<\/title>/",
		$block, $title );
	preg_match_all( "/<publisher>(.*?)<\/publisher>/",
		$block, $publisher );
	echo( $title[1][0]." - ".$author[1][0]." - ".
		$publisher[1][0]."\n" );
}
?>
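The regular-expression approach only works while the markup looks exactly as the pattern expects. The sketch below, using a made-up sample with an id attribute, shows the same pattern finding nothing while a DOM parser still finds the element.

```php
<?php
// Illustrative input: the <book> element carries an attribute.
$xml = '<books><book id="1"><title>PHP Hacks</title></book></books>';

// The pattern expects a bare <book> tag, so the attribute prevents a match.
$regexHits = preg_match_all('/<book>(.*?)<\/book>/s', $xml, $m);

// A DOM parser is not affected by the attribute.
$doc = new DOMDocument();
$doc->loadXML($xml);
$domHits = $doc->getElementsByTagName('book')->length;
```

For anything beyond a fixed, well-known format, the DOM or SAX approach is the safer choice.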
