This article explains how to use the iconv function in PHP.
I recently needed the iconv function in a program that converts fetched UTF-8 pages to GB2312, and I found that the converted data came out shorter than the input for no obvious reason.
The iconv library converts between many character sets and is an essential library for PHP programming.
1. Download the libiconv library: http://ftp.gnu.org/pub/gnu/libiconv/libiconv-1.9.2.tar.gz
2. Unpack it: tar -zxvf libiconv-1.9.2.tar.gz
3. Install libiconv:
# cd libiconv-1.9.2
# ./configure --prefix=/usr/local/iconv
# make
# make install
4. Recompile PHP, adding the configure option --with-iconv=/usr/local/iconv
I am currently writing a page scraper that uses the iconv function to convert fetched UTF-8 pages to GB2312. I found that the converted data came out shorter than the input for no obvious reason, which puzzled me for a while. After searching online, I learned that this is a known problem with iconv: it fails when converting the character "—" to GB2312.
The fix is simple: append "//IGNORE" to the target encoding, which is the second parameter of the iconv function. As follows:
iconv("UTF-8","GB2312//IGNORE",$data)
IGNORE tells iconv to skip errors during conversion. Without it, everything after the offending character is lost.
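The //IGNORE call above can be wrapped so that a failed conversion is noticed rather than silently producing shortened data. This is only a minimal sketch: the helper name utf8_to_gb2312 is hypothetical (it is not part of PHP), the type declarations assume PHP 7 or later, and what happens to the "—" character depends on the iconv implementation PHP was built against.

```php
<?php
// Hypothetical helper (not a PHP built-in): converts UTF-8 text to GB2312,
// asking iconv to drop characters the target charset cannot represent.
function utf8_to_gb2312(string $data): string
{
    // "@" hides the notice iconv raises on unconvertible input;
    // the return value is checked explicitly instead.
    $converted = @iconv('UTF-8', 'GB2312//IGNORE', $data);

    if ($converted === false) {
        // Some iconv builds report failure even with //IGNORE.
        throw new RuntimeException('iconv could not convert the input to GB2312');
    }

    return $converted;
}

// Usage: the result for text containing U+2014 is platform dependent,
// so the call is guarded instead of assuming a particular outcome.
try {
    echo bin2hex(utf8_to_gb2312("before \u{2014} after")), "\n";
} catch (RuntimeException $e) {
    echo $e->getMessage(), "\n";
}
```

Checking for false matters because a failed call otherwise looks like an empty string once it is echoed or stored.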
<?php
$str = '你好,这里是卖咖啡!';
echo $str;
echo '<br />';
echo iconv('GB2312', 'UTF-8', $str);      // convert the string from GB2312 to UTF-8
echo '<br />';
echo iconv_substr($str, 1, 1, 'UTF-8');   // extract by character count, not by bytes
print_r(iconv_get_encoding());            // show the current iconv encoding settings
echo iconv_strlen($str, 'UTF-8');         // string length in characters for the given encoding
// It is also used like this:
$content = iconv("UTF-8", "gbk//TRANSLIT", $content);
?>
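To make the difference between byte-based and character-based functions concrete, here is a small sketch. It assumes the script file itself is saved as UTF-8 and uses a shorter sample string than the one above; the variable names are only illustrative.

```php
<?php
// Assumption: this file is saved as UTF-8, so each Chinese character
// in the literal below occupies three bytes.
$str = '你好';

$byteLength = strlen($str);                        // counts bytes
$charLength = iconv_strlen($str, 'UTF-8');         // counts characters
$second     = iconv_substr($str, 1, 1, 'UTF-8');   // second character

echo $byteLength, "\n";  // 6
echo $charLength, "\n";  // 2
echo $second, "\n";      // 好
```

If the charset argument does not match the actual encoding of the string, iconv_strlen and iconv_substr return wrong results or fail, so the encoding of the source data has to be known first.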
In the older PHP setups described here, iconv is neither a built-in function nor a module installed by default. It has to be installed before it can be used.
On Windows 2000, edit php.ini and remove the ";" in front of extension=php_iconv.dll. You also need to copy iconv.dll from the original PHP installation files to winnt/system32 (if your DLL path points to that directory).
On Linux, build PHP statically and add the --with-iconv option when running configure. The iconv item then shows up in phpinfo. (Linux7.3 Apache4.06 php4.3.2)
Download: ftp://ftp.gnu.org/pub/gnu/libiconv/libiconv-1.8.tar.gz
Installation:
# cp libiconv-1.8.tar.gz /usr/local/src
# cd /usr/local/src
# tar zxvf libiconv-1.8.tar.gz
# cd libiconv-1.8
# ./configure --prefix=/usr/local/libiconv
# make
# make install
Compile PHP:
# ./configure --prefix=/usr/local/php4.3.2 --with-iconv=/usr/local/libiconv/
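After rebuilding, a short script can confirm that the extension is actually available, as an alternative to reading through phpinfo. This is a sketch only; ICONV_IMPL and ICONV_VERSION are constants defined by the iconv extension, and the variable name $hasIconv is illustrative.

```php
<?php
// Check whether the iconv extension was compiled in or loaded.
$hasIconv = extension_loaded('iconv') && function_exists('iconv');

if ($hasIconv) {
    // These constants are only defined when the extension is present.
    echo 'iconv implementation: ', ICONV_IMPL, "\n";
    echo 'iconv version: ', ICONV_VERSION, "\n";
} else {
    echo "iconv is not available; rebuild PHP with --with-iconv or enable the extension\n";
}
```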
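Simple example of use:
<?php
// Note: Chinese characters cannot be represented in ISO-8859-1, so this call
// is expected to fail unless //IGNORE or //TRANSLIT is appended to the target encoding.
echo iconv("gb2312", "ISO-8859-1", "我们");
?>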
Introduction to mb_convert_encoding and iconv functions in PHP
mb_convert_encoding converts a string from one character encoding to another. I did not understand character encodings before, but I am beginning to.
Plain English (ASCII) text rarely has encoding problems. They show up with Chinese and other multi-byte text. For example, suppose you write a program in Zend Studio or EditPlus with the file saved as GBK, and the data is stored in a database whose encoding is utf8. The data must then be converted first, otherwise it is garbled when it reaches the database.
See the official usage of mb_convert_encoding:
http://cn.php.net/manual/zh/function.mb-convert-encoding.php
A GBK to UTF-8 example:
<?php
header("Content-Type: text/html; charset=utf-8");
echo mb_convert_encoding("妳係我的友仔", "UTF-8", "GBK");
?>
A GB2312 to Big5 example:
<?php
header("Content-Type: text/html; charset=big5");
echo mb_convert_encoding("你是我的朋友", "big5", "GB2312");
?>
To use the function above, the mbstring extension must be enabled first.
PHP's iconv function also converts string encodings and serves a similar purpose.
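The examples above only work when the script file is saved in the stated source encoding. The following sketch avoids that dependency by writing the GBK input as byte escapes, and it checks for mbstring before calling it. The variable names are illustrative, and the sample bytes are "你好" encoded as GBK.

```php
<?php
// mb_convert_encoding is only defined when mbstring is enabled.
if (!extension_loaded('mbstring')) {
    throw new RuntimeException('The mbstring extension is not enabled');
}

// "你好" in GBK, written as raw bytes so the result does not depend
// on the encoding this file is saved in.
$gbk = "\xC4\xE3\xBA\xC3";

// Argument order: string, target encoding, source encoding.
$utf8 = mb_convert_encoding($gbk, 'UTF-8', 'GBK');

echo bin2hex($utf8), "\n";  // e4bda0e5a5bd
```

Note that the target encoding comes before the source encoding here, the reverse of iconv's argument order.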
There are some detailed examples below:
iconv: Convert string to requested character encoding (PHP 4 >= 4.0.5, PHP 5)
mb_convert_encoding: Convert character encoding (PHP 4 >= 4.0.6, PHP 5)
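Usage:
string mb_convert_encoding ( string str, string to_encoding [, mixed from_encoding] )
The mbstring extension must be enabled first: in php.ini, remove the ; in front of ;extension=php_mbstring.dll
mb_convert_encoding accepts several candidate input encodings and detects the right one from the content, but it runs much more slowly than iconv.
string iconv ( string in_charset, string out_charset, string str )
Note: besides the target encoding, the second parameter accepts two suffixes, //TRANSLIT and //IGNORE. //TRANSLIT automatically replaces characters that cannot be converted directly with one or more similar-looking characters. //IGNORE drops characters that cannot be converted. By default, the string is cut off at the first illegal character.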
Returns the converted string or FALSE on failure.
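Usage notes:
iconv fails when converting the character "—" to gb2312. Without the ignore option, everything after that character is lost. Either way, the "—" itself cannot be converted and is not output. mb_convert_encoding does not have this problem.
In general, use iconv. Use mb_convert_encoding only when the source encoding cannot be determined, or when the iconv result does not display correctly.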
from_encoding specifies the character encoding before conversion. It can be an array or a comma-separated string list. If it is not specified, the internal encoding is used.
/* Auto detect encoding from JIS, eucjp-win, sjis-win, then convert str to UCS-2LE */
$str = mb_convert_encoding($str, "UCS-2LE", "JIS, eucjp-win, sjis-win");
/* "auto" is expanded to "ASCII,JIS,UTF-8,EUC-JP,SJIS" */
$str = mb_convert_encoding($str, "EUC-JP", "auto");
Example:
$content = iconv("GBK", "UTF-8", $content);
$content = mb_convert_encoding($content, "UTF-8", "GBK");
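When it is unclear whether fetched text is already UTF-8, one possible approach is to validate it before converting. The sketch below assumes the input is either valid UTF-8 or GBK. The helper name to_utf8 and its $fallbackCharset parameter are hypothetical, and it requires both mbstring and iconv on PHP 7 or later.

```php
<?php
// Hypothetical helper: returns $text as UTF-8. It assumes the input is
// either already valid UTF-8 or encoded in $fallbackCharset.
function to_utf8(string $text, string $fallbackCharset = 'GBK'): string
{
    // Valid UTF-8 is returned untouched, which avoids double conversion.
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text;
    }

    // Otherwise treat the bytes as the fallback charset and convert.
    $converted = @iconv($fallbackCharset, 'UTF-8', $text);
    if ($converted === false) {
        throw new RuntimeException("Input is not valid $fallbackCharset either");
    }

    return $converted;
}

// Usage: the first input is "你好" as GBK bytes, the second is already UTF-8.
echo to_utf8("\xC4\xE3\xBA\xC3"), "\n";
echo to_utf8('你好'), "\n";
```

Validating first matters because UTF-8 bytes are often also a legal GBK byte sequence, so converting unconditionally can succeed and still produce garbled text.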
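An easily overlooked parameter of the iconv function in PHP
Today, while processing scraped content, I found that converting it with iconv cut the result short. I guessed it was a character set problem and looked for a way to skip characters that do not exist in the target character set. The manual shows only three parameters for iconv, which did not seem to help. Some people online said it could be done, but not how. Eventually I found that the English description says a flag such as "TRANSLIT" can be appended to the target encoding. But how? It turns out the flag has to be prefixed with "//", which is a rather surprising design.
Prototype: $txtContent = iconv("utf-8", 'GBK', $txtContent);
Special parameter: iconv("UTF-8", "GB2312//IGNORE", $data)
Two optional helper flags: TRANSLIT and IGNORE (IGNORE means that characters that cannot be converted are skipped).
Description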
string iconv ( string in_charset, string out_charset, string str )
Performs a character set conversion on the string str from in_charset to out_charset. Returns the converted string or FALSE on failure.
If you append the string //TRANSLIT to out_charset transliteration is activated. This means that when a character can't be represented in the target charset, it can be approximated through one or several similarly looking characters. If you append the string //IGNORE, characters that cannot be represented in the target charset are silently discarded. Otherwise, str is cut from the first illegal character.
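The three modes described above can be compared side by side. This is a sketch rather than a statement of what any given system prints: the outcome for each target depends on the iconv implementation (glibc or libiconv) and on the PHP version. Recent PHP versions, as far as I know, return false with a notice for unconvertible input instead of a truncated string. The variable names are illustrative and the "\u{...}" escape needs PHP 7 or later.

```php
<?php
// "A", U+2014 (the dash discussed in this article), "B", as UTF-8.
$text = "A\u{2014}B";

// Same target charset: plain, with transliteration, with ignore.
$targets = ['GB2312', 'GB2312//TRANSLIT', 'GB2312//IGNORE'];

$results = [];
foreach ($targets as $target) {
    // "@" hides the notice; failure is recorded as false instead.
    $converted = @iconv('UTF-8', $target, $text);
    $results[$target] = ($converted === false) ? false : bin2hex($converted);
}

// Prints the hex bytes produced for each mode, or false on failure.
var_dump($results);
```

Running this on the actual deployment machine is a quick way to see which suffix, if any, gives usable output there.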