
How to use the iconv function in PHP

WBOY
2016-07-21 15:52:03

The iconv library converts text between character sets and is a basic, indispensable library in PHP programming. To build it from source:
1. Download the libiconv library: http://ftp.gnu.org/pub/gnu/libiconv/libiconv-1.9.2.tar.gz
2. Unpack it: tar -zxvf libiconv-1.9.2.tar.gz
3. Install libiconv:
# cd libiconv-1.9.2
# ./configure --prefix=/usr/local/iconv
# make
# make install
4. Recompile PHP, adding the configure parameter --with-iconv=/usr/local/iconv

Under Windows

I have been working on a page scraper (a "thief program") that uses iconv to convert fetched UTF-8 pages to gb2312. I found that after transcoding the fetched data with iconv, part of the data went missing for no apparent reason, which puzzled me for a while. After searching online, I learned that this is commonly described as a bug in iconv: it raises an error when converting the character "—" to gb2312, and conversion stops at that point.
The fix is simple: append "//IGNORE" to the target encoding, which is the second parameter of iconv, as follows:


iconv("UTF-8", "GB2312//IGNORE", $data);

//IGNORE tells iconv to skip characters it cannot convert. Without it, everything after the first unconvertible character is lost.
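
The following is a minimal sketch of how the //IGNORE suffix might be combined with an explicit failure check. The helper name toGb2312 is hypothetical, and behaviour differs between iconv implementations (glibc and libiconv): on some builds iconv may still raise a notice and return false even with //IGNORE, so the return value is checked rather than assumed.

```php
<?php
// Hypothetical helper: convert UTF-8 text to GB2312, dropping characters
// that the target charset cannot represent.
function toGb2312(string $utf8): ?string
{
    // "@" suppresses the notice iconv raises on unconvertible input;
    // the return value is checked explicitly instead.
    $converted = @iconv('UTF-8', 'GB2312//IGNORE', $utf8);

    // iconv returns false on failure; report that as null.
    return $converted === false ? null : $converted;
}

$result = toGb2312('price: 10 yuan');
echo $result === null
    ? "conversion failed\n"
    : 'converted ' . strlen($result) . " bytes\n";
```

Returning null on failure makes it hard to mistake a failed conversion for an empty string.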
<?php
// The literal is assumed to be GB2312-encoded (the encoding of the source file).
$str = 'Hello, coffee is sold here!';
echo $str;
echo "\n";
echo iconv('GB2312', 'UTF-8', $str); // Convert the string from GB2312 to UTF-8
echo "\n";
echo iconv_substr($str, 1, 1, 'UTF-8'); // Extract by character count, not by bytes
print_r(iconv_get_encoding()); // Show the current iconv encoding settings
echo iconv_strlen($str, 'UTF-8'); // String length in characters for the given encoding
// The //TRANSLIT suffix is also used ($content is assumed to hold UTF-8 text):
$content = iconv("UTF-8", "gbk//TRANSLIT", $content);
?>
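
To make the difference between bytes and characters concrete, here is a small sketch. The variable names are illustrative only, and the two-character string is written with byte escapes so that the example does not depend on the encoding of the source file.

```php
<?php
// Two Chinese characters ("你好") as UTF-8 bytes: three bytes per character.
$text = "\xE4\xBD\xA0\xE5\xA5\xBD";

$byteLength = strlen($text);                      // counts bytes
$charLength = iconv_strlen($text, 'UTF-8');       // counts characters
$second     = iconv_substr($text, 1, 1, 'UTF-8'); // second character only

echo $byteLength, "\n";
echo $charLength, "\n";
echo bin2hex($second), "\n";
```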

iconv is not a core PHP function, nor is it a module installed by default in the PHP 4 setups described here. It must be installed before it can be used.
On Windows 2000 with PHP, edit php.ini and remove the ";" in front of extension=php_iconv.dll. You also need to copy iconv.dll from your PHP installation directory to winnt/system32 (if that is the directory your DLLs are loaded from).
On Linux, build it statically by adding --with-iconv when running configure. Afterwards, phpinfo() shows an iconv section. (Linux7.3+Apache4.06+php4.3.2)
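
A quick way to confirm that the extension is actually available is to ask PHP at runtime. This is only a sketch; the variable name is illustrative, while ICONV_IMPL and ICONV_VERSION are constants defined by the iconv extension itself.

```php
<?php
// Report whether the iconv extension is available in this PHP build.
$hasIconv = extension_loaded('iconv') && function_exists('iconv');

if ($hasIconv) {
    // These constants only exist when the extension is loaded.
    echo 'iconv available: ', ICONV_IMPL, ' ', ICONV_VERSION, "\n";
} else {
    echo "iconv is not loaded; check php.ini or the configure options\n";
}
```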

Download: ftp://ftp.gnu.org/pub/gnu/libiconv/libiconv-1.8.tar.gz
Installation:
# cp libiconv-1.8.tar.gz /usr/local/src
# cd /usr/local/src
# tar zxvf libiconv-1.8.tar.gz
# cd libiconv-1.8
# ./configure --prefix=/usr/local/libiconv
# make
# make install
Compile PHP:
# ./configure --prefix=/usr/local/php4.3.2 --with-iconv=/usr/local/libiconv/
A simple usage example:

<?php
echo iconv("gb2312", "ISO-8859-1", "we");
?>

Introduction to the mb_convert_encoding and iconv functions in PHP

The mb_convert_encoding function converts a string from one encoding to another. I used to find character encodings confusing, but I now understand them a little better.
Plain English text rarely has encoding problems; Chinese data usually does. For example, suppose you write a program in Zend Studio or EditPlus with the file saved as gbk, and the data goes into a database whose encoding is utf8. The data must then be converted first, otherwise it will be stored as garbled text.

See the official usage of mb_convert_encoding:
http://cn.php.net/manual/zh/function.mb-convert-encoding.php

A GBK to UTF-8 example:
<?php
header("Content-Type: text/html; charset=UTF-8");
echo mb_convert_encoding("You are my friend", "UTF-8", "GBK");
?>

A GB2312 to Big5 example:
<?php
header("Content-Type: text/html; charset=big5");
echo mb_convert_encoding("You are my friend", "big5", "GB2312");
?>
To use the function above, you first need to enable the mbstring extension.
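
Because the sample string in the examples above is plain ASCII, the conversion leaves it unchanged. The sketch below uses a two-character Chinese string instead, written with byte escapes, so the effect of the conversion is visible. It assumes mbstring is enabled; the variable names are illustrative.

```php
<?php
if (!extension_loaded('mbstring')) {
    exit("mbstring is not enabled\n");
}

// "你好" as UTF-8 bytes, written with escapes so the example does not
// depend on the encoding of the source file.
$utf8 = "\xE4\xBD\xA0\xE5\xA5\xBD";

// UTF-8 -> GBK: each of the two characters becomes two bytes.
$gbk = mb_convert_encoding($utf8, 'GBK', 'UTF-8');
echo bin2hex($gbk), "\n";

// GBK -> UTF-8: converting back should restore the original bytes.
$roundTrip = mb_convert_encoding($gbk, 'UTF-8', 'GBK');
echo $roundTrip === $utf8 ? "round trip OK\n" : "round trip differs\n";
```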

PHP's other conversion function, iconv, also converts string encodings and works much like the function above.

There are some detailed examples below:
iconv — Convert string to requested character encoding
(PHP 4 >= 4.0.5, PHP 5)
mb_convert_encoding — Convert character encoding
(PHP 4 >= 4.0.6, PHP 5)

Usage:
string mb_convert_encoding ( string str, string to_encoding [, mixed from_encoding] )
You need to enable the mbstring extension first: in php.ini, remove the ";" in front of extension=php_mbstring.dll.
mb_convert_encoding accepts several candidate input encodings and picks one automatically based on the content, but it runs much less efficiently than iconv.


string iconv (string in_charset, string out_charset, string str)
Note: besides naming the target encoding, the second parameter can carry one of two suffixes, //TRANSLIT or //IGNORE. //TRANSLIT replaces a character that cannot be converted directly with one or more approximate characters. //IGNORE drops characters that cannot be converted. With neither suffix, the string is truncated at the first illegal character.
Returns the converted string or FALSE on failure.
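
The three behaviours can be compared side by side with a sketch like the one below. The exact results depend on the iconv implementation behind PHP (glibc or libiconv) and on the PHP version, so no particular output is assumed here; the variable names are illustrative.

```php
<?php
// Input with one character (an em dash, U+2014) that ASCII cannot hold.
$input = "A\u{2014}B";

$targets = ['ASCII', 'ASCII//TRANSLIT', 'ASCII//IGNORE'];
$results = [];

foreach ($targets as $target) {
    // "@" hides the notice raised for unconvertible input; a failed
    // conversion shows up as false in the dump instead.
    $results[$target] = @iconv('UTF-8', $target, $input);

    echo str_pad($target, 16), ' => ';
    var_dump($results[$target]);
}
```

Running this on the target system is a quick way to see which suffix, if any, gives acceptable output before relying on it.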


Use:

In practice, iconv raises an error when converting the character "—" to gb2312. Without the //IGNORE suffix, everything after that character is lost, and the "—" itself can never be converted or output. mb_convert_encoding does not have this problem.

In general, use iconv. Use mb_convert_encoding only when the source encoding cannot be determined, or when the result of the iconv conversion does not display correctly.
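
One way to follow this advice is a small wrapper that tries iconv first and falls back to mb_convert_encoding. This is a sketch only: the function name convertEncoding is hypothetical, and it covers the "iconv fails" case rather than encoding detection.

```php
<?php
// Hypothetical helper: try iconv first, fall back to mb_convert_encoding
// when iconv fails or is unavailable.
function convertEncoding(string $text, string $from, string $to): string
{
    if (function_exists('iconv')) {
        // "@" hides the notice; failure is detected via the false return.
        $converted = @iconv($from, $to, $text);
        if ($converted !== false) {
            return $converted;
        }
    }

    if (function_exists('mb_convert_encoding')) {
        // Note the different argument order: target first, then source.
        $converted = mb_convert_encoding($text, $to, $from);
        if (is_string($converted)) {
            return $converted;
        }
    }

    throw new RuntimeException("Cannot convert from $from to $to");
}

echo convertEncoding('plain ASCII text', 'UTF-8', 'GBK'), "\n";
```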

from_encoding names the character encoding before conversion. It can be an array or a comma-separated string. If it is not specified, the internal encoding is used.
/* Auto detect encoding from JIS, eucjp-win, sjis-win, then convert str to UCS-2LE */
$str = mb_convert_encoding($str, "UCS-2LE", "JIS, eucjp-win, sjis-win");
/* "auto" is expanded to "ASCII,JIS,UTF-8,EUC-JP,SJIS" */
$str = mb_convert_encoding($str, "EUC-JP", "auto");
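
The same candidate-list idea can be tried with the Chinese encodings discussed in this article. The sketch below assumes mbstring is enabled and passes GBK-encoded bytes together with a list of candidates. Detection is heuristic, so short or ambiguous input may be identified incorrectly; the variable names are illustrative.

```php
<?php
if (!function_exists('mb_convert_encoding')) {
    exit("mbstring is not enabled\n");
}

// Two Chinese characters ("你好") as GBK bytes. These bytes are neither
// valid ASCII nor valid UTF-8.
$bytes = "\xC4\xE3\xBA\xC3";

// The source encoding is given as a comma-separated list of candidates.
$converted = mb_convert_encoding($bytes, 'UTF-8', 'ASCII, UTF-8, GBK');

echo bin2hex($converted), "\n";
```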

Example:
$content = iconv("GBK", "UTF-8", $content);
$content = mb_convert_encoding($content, "UTF-8", "GBK");

Parameters that are easily overlooked when using the iconv function in PHP
While processing scraped content today, I found that converting it with iconv cut the result short. I guessed it was a character set problem and wondered how to skip characters that do not exist in the target character set. The manual showed that iconv takes only three parameters, which did not seem to help. Someone online said it could be done, but I could not see how. Eventually I noticed that the English description says you can append a flag to the target encoding: "TRANSLIT". But how do you append it? It turns out you have to put "//" in front of it, a design that is easy to miss.
Basic form: $txtContent = iconv("utf-8", 'GBK', $txtContent);

With a flag: iconv("UTF-8", "GB2312//IGNORE", $data)


There are two optional flags, TRANSLIT and IGNORE (IGNORE skips anything that cannot be converted).

Description

string iconv ( string in_charset, string out_charset, string str )

Performs a character set conversion on the string str from in_charset to out_charset. Returns the converted string or FALSE on failure.

If you append the string //TRANSLIT to out_charset transliteration is activated. This means that when a character can't be represented in the target charset, it can be approximated through one or several similarly looking characters. If you append the string //IGNORE, characters that cannot be represented in the target charset are silently discarded. Otherwise, str is cut from the first illegal character.
