How to use the iconv function in PHP
The iconv library converts text between character sets and is an essential base library for PHP programming.
1. Download the libiconv library: http://ftp.gnu.org/pub/gnu/libiconv/libiconv-1.9.2.tar.gz
2. Unpack it: tar -zxvf libiconv-1.9.2.tar.gz
3. Install libiconv:
# ./configure --prefix=/usr/local/iconv
# make
# make install
4. When compiling PHP, add this option to its configure command:
--with-iconv=/usr/local/iconv
under windows
I was recently working on a scraper and needed iconv to convert captured UTF-8 pages to gb2312. After transcoding the captured data with iconv, part of the data went missing for no apparent reason, which puzzled me for a while. After searching online, I found that this is a bug in the iconv function: iconv fails when converting the character "—" to gb2312. The fix is simple: append "//IGNORE" to the target encoding, which is the second parameter of the iconv function. As follows:
<?php
echo $str = 'Hello, we sell coffee here!';
echo '<br />';
echo iconv('GB2312', 'UTF-8', $str); // Convert the string encoding from GB2312 to UTF-8
echo '<br />';
echo iconv_substr($str, 1, 1, 'UTF-8'); // Extract by number of characters rather than bytes
print_r(iconv_get_encoding()); // Get iconv's current encoding settings
echo iconv_strlen($str, 'UTF-8'); // Get the string length in the given encoding
// It can also be used like this ($content holds UTF-8 text)
$content = iconv("UTF-8", "gbk//TRANSLIT", $content);
?>
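The sample above does not exercise the //IGNORE suffix described earlier. A minimal sketch of it follows, assuming PHP 7 or later with the iconv extension loaded; the sample text and variable names are made up for illustration. What each call returns depends on the iconv implementation PHP was built against, so the sketch prints the results instead of stating them.

```php
<?php
// Hypothetical sample text: UTF-8 containing U+2014, the dash discussed above.
$utf8 = "before \u{2014} after";

// No suffix: if the dash has no mapping in the target charset, the call fails.
// The @ operator hides the notice PHP raises in that case.
$plain = @iconv('UTF-8', 'GB2312', $utf8);

// With //IGNORE appended to the target encoding (the second parameter),
// unconvertible characters are meant to be skipped instead.
$ignored = @iconv('UTF-8', 'GB2312//IGNORE', $utf8);

var_dump($plain);
var_dump($ignored);
```

Some builds (notably those linked against glibc) are known to still return false with //IGNORE, so it is worth checking the result against false before using it.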
In the PHP 4 setups described here, iconv is not a built-in PHP function, nor is it a module installed by default. It must be installed before it can be used.
If you run PHP on Windows 2000, edit the php.ini file and remove the ";" before extension=php_iconv.dll. You also need to copy iconv.dll from your original PHP installation files to winnt/system32 (if your DLL path points to that directory).
On Linux, build PHP statically and add the --with-iconv option when running configure. Afterwards the iconv entry shows up in the phpinfo output. (Linux7.3 Apache4.06 php4.3.2)
Download: ftp://ftp.gnu.org/pub/gnu/libiconv/libiconv-1.8.tar.gz
Installation:
# cp libiconv-1.8.tar.gz /usr/local/src
# cd /usr/local/src
# tar zxvf libiconv-1.8.tar.gz
# cd libiconv-1.8
# ./configure --prefix=/usr/local/libiconv
# make
# make install
Compile PHP:
# ./configure --prefix=/usr/local/php4.3.2 --with-iconv=/usr/local/libiconv/
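Once PHP is built, a script can confirm that the extension is really there without opening phpinfo. The following is a small sketch of such a check; the messages are illustrative, while extension_loaded and the ICONV_IMPL and ICONV_VERSION constants come from PHP and its iconv extension.

```php
<?php
// Check at runtime whether the iconv extension is available.
$hasIconv = extension_loaded('iconv');

if ($hasIconv) {
    // ICONV_IMPL and ICONV_VERSION are constants defined by the extension.
    echo 'iconv implementation: ', ICONV_IMPL, ' ', ICONV_VERSION, PHP_EOL;
} else {
    echo 'iconv is not loaded; rebuild PHP with --with-iconv or enable the extension.', PHP_EOL;
}
```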
A simple example of use:
<?php
echo iconv("gb2312","ISO-8859-1","we");
?>
Introduction to mb_convert_encoding and iconv functions in PHP
The mb_convert_encoding function converts a string from one encoding to another. I used to find the idea of character encoding in programs confusing, but I now understand it a little better.
English text rarely has encoding problems; they mostly appear with Chinese data. For example, suppose you write a program in Zend Studio or EditPlus using gbk encoding. If the data has to go into a database whose encoding is utf8, it must be converted first, otherwise it will be garbled once stored.
See the official usage of mb_convert_encoding:
http://cn.php.net/manual/zh/function.mb-convert-encoding.php
A GBK to UTF-8 conversion:
<?php
header("Content-Type: text/html; charset=utf-8");
echo mb_convert_encoding("You are my friend", "UTF-8", "GBK");
?>
A GB2312 to Big5 conversion:
<?php
header("Content-Type: text/html; charset=big5");
echo mb_convert_encoding("You are my friend", "big5", "GB2312");
?>
To use the function above, the mbstring extension must be enabled first.
PHP's other conversion function, iconv, also converts string encodings and works much like the function above.
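With an ASCII-only string the conversion leaves the bytes unchanged, so the effect is easier to see with real multi-byte text. The sketch below assumes PHP 7 or later (for the \u{...} escapes) with mbstring enabled; the sample string and variable names are made up for illustration.

```php
<?php
// "\u{4F60}\u{597D}" is a two-character Chinese string, written with escapes
// so the example does not depend on how this file itself is saved.
$utf8 = "\u{4F60}\u{597D}";

$gbk = mb_convert_encoding($utf8, 'GBK', 'UTF-8');   // UTF-8 -> GBK
echo strlen($utf8), ' bytes in UTF-8, ', strlen($gbk), ' bytes in GBK', PHP_EOL;

$back = mb_convert_encoding($gbk, 'UTF-8', 'GBK');   // GBK -> UTF-8
var_dump($back === $utf8);                           // round trip is lossless here
```

The same two characters take three bytes each in UTF-8 and two bytes each in GBK, which is why storing one encoding where the other is expected produces garbled text.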
There are some detailed examples below:
iconv — Convert string to requested character encoding
(PHP 4 >= 4.0.5, PHP 5)
mb_convert_encoding — Convert character encoding
(PHP 4 >= 4.0.6, PHP 5)
Usage:
string mb_convert_encoding ( string str, string to_encoding [, mixed from_encoding] )
You need to enable the mbstring extension library first. In php.ini, remove the ; in front of extension=php_mbstring.dll
mb_convert_encoding accepts several candidate input encodings and picks one automatically based on the content, but it runs much slower than iconv.
string iconv (string in_charset, string out_charset, string str)
Note: In addition to specifying the encoding to be converted to, the second parameter can also add two suffixes: //TRANSLIT and //IGNORE, where //TRANSLIT will automatically convert characters that cannot be converted directly into one or more approximate characters. //IGNORE will ignore characters that cannot be converted, and the default effect is to truncate from the first illegal character.
Returns the converted string or FALSE on failure.
Use:
iconv was found to fail when converting the character "—" to gb2312. Without the //IGNORE suffix, everything after that character is lost. Whatever else is tried, the "—" itself cannot be converted or output. mb_convert_encoding does not have this bug.
In general, iconv is used. The mb_convert_encoding function is only used when the original encoding cannot be determined, or the iconv conversion cannot be displayed normally.
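The rule of thumb above (iconv first, mb_convert_encoding when iconv cannot cope) could be wrapped in a small helper. This is only a sketch under stated assumptions: the function name convert_encoding_with_fallback is hypothetical, both extensions are assumed to be loaded, and it relies on current PHP versions returning false from iconv on failure, which the old truncating behaviour described in this article would not trigger.

```php
<?php
/**
 * Hypothetical helper: try iconv first, fall back to mb_convert_encoding.
 */
function convert_encoding_with_fallback($text, $from, $to)
{
    // @ suppresses the notice iconv raises on characters it cannot convert.
    $result = @iconv($from, $to, $text);
    if ($result !== false) {
        return $result;
    }
    // Fallback: by default mbstring substitutes "?" for unconvertible characters.
    return mb_convert_encoding($text, $to, $from);
}

// Usage: a two-character Chinese string converted from UTF-8 to GBK.
$gbk = convert_encoding_with_fallback("\u{4F60}\u{597D}", 'UTF-8', 'GBK');
echo bin2hex($gbk), PHP_EOL;
```

Trying iconv first keeps the faster path for the common case, and the fallback only runs when the first conversion reports a failure.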
from_encoding names the character encoding of the string before conversion. It can be an array or a comma-separated string list. If it is not specified, the internal encoding is used.
/* Auto detect encoding from JIS, eucjp-win, sjis-win, then convert str to UCS-2LE */
$str = mb_convert_encoding($str, "UCS-2LE", "JIS, eucjp-win, sjis-win");
/* "auto" is expanded to "ASCII,JIS,UTF-8,EUC-JP,SJIS" */
$str = mb_convert_encoding($str, "EUC-JP", "auto");
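When the source encoding is unknown, the detection step can also be made explicit before converting. The sketch below swaps in mb_detect_encoding in strict mode followed by a plain conversion, instead of passing the candidate list straight to mb_convert_encoding; the input bytes and candidate list are assumptions chosen for illustration, and mbstring is assumed to be enabled.

```php
<?php
// Hypothetical input: four bytes whose encoding the script does not know
// (they are the GBK encoding of a two-character Chinese string).
$bytes = "\xC4\xE3\xBA\xC3";

// Candidate encodings in order of preference. Strict mode (third argument)
// rejects any candidate in which the bytes are not valid.
$detected = mb_detect_encoding($bytes, ['UTF-8', 'GBK'], true);

if ($detected !== false) {
    $utf8 = mb_convert_encoding($bytes, 'UTF-8', $detected);
    echo $detected, ' -> UTF-8: ', bin2hex($utf8), PHP_EOL;
} else {
    echo 'No candidate encoding matched.', PHP_EOL;
}
```

Detection is a guess based on byte patterns, so short or ambiguous input can match more than one candidate; the order of the list decides between them.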
Example:
$content = iconv("GBK", "UTF-8", $content);
$content = mb_convert_encoding($content, "UTF-8", "GBK");
Parameters that are easily overlooked when using the iconv function in php
While processing scraped content today, I used iconv for encoding conversion and found that the output was cut off partway. I guessed it was a character set problem and wondered how to skip characters that do not exist in the target character set. The manual shows that iconv takes only three parameters, which seemed to leave no room for this. Searching online, I found people saying it could be done, but not how. Eventually I found that the English description says you can append a marker such as "TRANSLIT" to the target encoding. How to append it was still unclear, until it turned out that the marker must be preceded by "//". It is an odd design.
Basic call: $txtContent = iconv("utf-8",'GBK',$txtContent);
With a suffix: iconv("UTF-8","GB2312//IGNORE",$data)
Two optional suffixes: TRANSLIT and IGNORE (IGNORE means that characters that cannot be converted are skipped).
Description
string iconv ( string in_charset, string out_charset, string str )
Performs a character set conversion on the string str from in_charset to out_charset. Returns the converted string or FALSE on failure.
If you append the string //TRANSLIT to out_charset transliteration is activated. This means that when a character can't be represented in the target charset, it can be approximated through one or several similarly looking characters. If you append the string //IGNORE, characters that cannot be represented in the target charset are silently discarded. Otherwise, str is cut from the first illegal character.
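To see what transliteration does in practice, the plain and //TRANSLIT targets can be compared side by side. This is a sketch only, assuming PHP 7 or later with the iconv extension; the sample string and the $results array are made up for illustration, and the approximations chosen (if any) depend on the iconv implementation and sometimes on the locale, so no output is stated here.

```php
<?php
// Hypothetical sample: the euro sign U+20AC cannot be represented in ASCII.
$utf8 = "price: 5\u{20AC}";

$results = [];
foreach (['ASCII', 'ASCII//TRANSLIT'] as $target) {
    // @ hides the notice raised when a character cannot be converted.
    $results[$target] = @iconv('UTF-8', $target, $utf8);
}

// Print each target encoding next to what iconv returned for it.
var_export($results);
echo PHP_EOL;
```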