搜索
首页php教程php手册php字符串处理之全角半角转换,php字符串全角半角

php字符串处理之全角半角转换,php字符串全角半角

半角全角的处理是字符串处理的常见问题,本文尝试为大家提供一个思路。

一、概念

全角字符unicode编码从65281~65374 (十六进制 0xFF01 ~ 0xFF5E)
半角字符unicode编码从33~126 (十六进制 0x21~ 0x7E)
空格比较特殊,全角为 12288(0x3000),半角为 32 (0x20)
而且除空格外,全角/半角按unicode编码排序在顺序上是对应的
所以可以直接通过用+-法来处理非空格数据,对空格单独处理

二、实现思路

1. 找到目标unicode的字符,可以使用正则表达式解决

2. 修改unicode编码

三、实现

1. 首先是两个unicode与字符的转换函数:

<span> 1</span>     <span>/*</span><span>*
</span><span> 2</span> <span>     * 将unicode转换成字符
</span><span> 3</span> <span>     * @param int $unicode
</span><span> 4</span> <span>     * @return string UTF-8字符
</span><span> 5</span> <span>     *</span><span>*/</span>
<span> 6</span>     <span>function</span> unicode2Char(<span>$unicode</span><span>){
</span><span> 7</span>         <span>if</span>(<span>$unicode</span> < 128)     <span>return</span> <span>chr</span>(<span>$unicode</span><span>);
</span><span> 8</span>         <span>if</span>(<span>$unicode</span> < 2048)    <span>return</span> <span>chr</span>((<span>$unicode</span> >> 6) + 192) .
<span> 9</span>                                       <span>chr</span>((<span>$unicode</span> & 63) + 128<span>);
</span><span>10</span>         <span>if</span>(<span>$unicode</span> < 65536)   <span>return</span> <span>chr</span>((<span>$unicode</span> >> 12) + 224) .
<span>11</span>                                       <span>chr</span>(((<span>$unicode</span> >> 6) & 63) + 128) .
<span>12</span>                                       <span>chr</span>((<span>$unicode</span> & 63) + 128<span>);
</span><span>13</span>         <span>if</span>(<span>$unicode</span> < 2097152) <span>return</span> <span>chr</span>((<span>$unicode</span> >> 18) + 240) .
<span>14</span>                                       <span>chr</span>(((<span>$unicode</span> >> 12) & 63) + 128) .
<span>15</span>                                       <span>chr</span>(((<span>$unicode</span> >> 6) & 63) + 128) .
<span>16</span>                                       <span>chr</span>((<span>$unicode</span> & 63) + 128<span>);
</span><span>17</span>         <span>return</span> <span>false</span><span>;
</span><span>18</span> <span>    }
</span><span>19</span>  
<span>20</span>     <span>/*</span><span>*
</span><span>21</span> <span>     * 将字符转换成unicode
</span><span>22</span> <span>     * @param string $char 必须是UTF-8字符
</span><span>23</span> <span>     * @return int
</span><span>24</span> <span>     *</span><span>*/</span>
<span>25</span>     <span>function</span> char2Unicode(<span>$char</span><span>){
</span><span>26</span>         <span>switch</span> (<span>strlen</span>(<span>$char</span><span>)){
</span><span>27</span>             <span>case</span> 1 : <span>return</span> <span>ord</span>(<span>$char</span><span>);
</span><span>28</span>             <span>case</span> 2 : <span>return</span> (<span>ord</span>(<span>$char</span>{1}) & 63) |
<span>29</span>                             ((<span>ord</span>(<span>$char</span>{0}) & 31) << 6<span>);
</span><span>30</span>             <span>case</span> 3 : <span>return</span> (<span>ord</span>(<span>$char</span>{2}) & 63) |
<span>31</span>                             ((<span>ord</span>(<span>$char</span>{1}) & 63) << 6) |
<span>32</span>                             ((<span>ord</span>(<span>$char</span>{0}) & 15) << 12<span>);
</span><span>33</span>             <span>case</span> 4 : <span>return</span> (<span>ord</span>(<span>$char</span>{3}) & 63) |
<span>34</span>                             ((<span>ord</span>(<span>$char</span>{2}) & 63) << 6) |
<span>35</span>                             ((<span>ord</span>(<span>$char</span>{1}) & 63) << 12) |
<span>36</span>                             ((<span>ord</span>(<span>$char</span>{0}) & 7)  << 18<span>);
</span><span>37</span>             <span>default</span> :
<span>38</span>                 <span>trigger_error</span>('Character is not UTF-8!', <span>E_USER_WARNING</span><span>);
</span><span>39</span>                 <span>return</span> <span>false</span><span>;
</span><span>40</span> <span>        }
</span><span>41</span>     }

  2. 全角转半角

<span> 1</span>     <span>/*</span><span>*
</span><span> 2</span> <span>     * 全角转半角
</span><span> 3</span> <span>     * @param string $str
</span><span> 4</span> <span>     * @return string
</span><span> 5</span> <span>     *</span><span>*/</span>
<span> 6</span>     <span>function</span> sbc2Dbc(<span>$str</span><span>){
</span><span> 7</span>         <span>return</span> <span>preg_replace</span><span>(
</span><span> 8</span>             <span>//</span><span> 全角字符 </span>
<span> 9</span>             '/[\x{3000}\x{ff01}-\x{ff5f}]/ue',
<span>10</span>             <span>//</span><span> 编码转换
</span><span>11</span> <span>            // 0x3000是空格,特殊处理,其他全角字符编码-0xfee0即可以转为半角</span>
<span>12</span>             '($unicode=char2Unicode(\'\0\')) == 0x3000 ? " " : (($code=$unicode-0xfee0) > 256 ? unicode2Char($code) : chr($code))',
<span>13</span>             <span>$str</span>
<span>14</span> <span>        );
</span><span>15</span>     }

3. 半角转全角

<span> 1</span>     <span>/*</span><span>*
</span><span> 2</span> <span>     * 半角转全角
</span><span> 3</span> <span>     * @param string $str
</span><span> 4</span> <span>     * @return string
</span><span> 5</span> <span>     *</span><span>*/</span>
<span> 6</span>     <span>function</span> dbc2Sbc(<span>$str</span><span>){</span>
<span> 7</span>         <span>return</span> <span>preg_replace</span><span>(
</span><span> 8</span>             <span>//</span><span> 半角字符 </span>
<span> 9</span>             '/[\x{0020}\x{0020}-\x{7e}]/ue',  
<span>10</span>             <span>//</span><span> 编码转换
</span><span>11</span> <span>            // 0x0020是空格,特殊处理,其他半角字符编码+0xfee0即可以转为全角</span>
<span>12</span>             '($unicode=char2Unicode(\'\0\')) == 0x0020 ? unicode2Char(0x3000) : (($code=$unicode+0xfee0) > 256 ? unicode2Char($code) : chr($code))',
<span>13</span>             <span>$str</span>
<span>14</span> <span>        );
</span><span>15</span>     }

四、测试

 示例代码:

<span>1</span> <span>$a</span> = 'abc12 345'<span>;
</span><span>2</span> <span>$sbc</span> = dbc2Sbc(<span>$a</span><span>);
</span><span>3</span> <span>$dbc</span> = sbc2Dbc(<span>$sbc</span><span>);
</span><span>4</span> 
<span>5</span> <span>var_dump</span>(<span>$a</span>, <span>$sbc</span>, <span>$dbc</span>);

结果:

<span>1</span> <span>string</span>(9) "abc12 345"
<span>2</span> <span>string</span>(27) "abc12 345"
<span>3</span> <span>string</span>(9) "abc12 345"

 

声明
本文内容由网友自发贡献,版权归原作者所有,本站不承担相应法律责任。如您发现有涉嫌抄袭侵权的内容,请联系admin@php.cn

热AI工具

Undresser.AI Undress

Undresser.AI Undress

人工智能驱动的应用程序,用于创建逼真的裸体照片

AI Clothes Remover

AI Clothes Remover

用于从照片中去除衣服的在线人工智能工具。

Undress AI Tool

Undress AI Tool

免费脱衣服图片

Clothoff.io

Clothoff.io

AI脱衣机

Video Face Swap

Video Face Swap

使用我们完全免费的人工智能换脸工具轻松在任何视频中换脸!

热工具

SecLists

SecLists

SecLists是最终安全测试人员的伙伴。它是一个包含各种类型列表的集合,这些列表在安全评估过程中经常使用,都在一个地方。SecLists通过方便地提供安全测试人员可能需要的所有列表,帮助提高安全测试的效率和生产力。列表类型包括用户名、密码、URL、模糊测试有效载荷、敏感数据模式、Web shell等等。测试人员只需将此存储库拉到新的测试机上,他就可以访问到所需的每种类型的列表。

Atom编辑器mac版下载

Atom编辑器mac版下载

最流行的的开源编辑器

EditPlus 中文破解版

EditPlus 中文破解版

体积小,语法高亮,不支持代码提示功能

PhpStorm Mac 版本

PhpStorm Mac 版本

最新(2018.2.1 )专业的PHP集成开发工具

DVWA

DVWA

Damn Vulnerable Web App (DVWA) 是一个PHP/MySQL的Web应用程序,非常容易受到攻击。它的主要目标是成为安全专业人员在合法环境中测试自己的技能和工具的辅助工具,帮助Web开发人员更好地理解保护Web应用程序的过程,并帮助教师/学生在课堂环境中教授/学习Web应用程序安全。DVWA的目标是通过简单直接的界面练习一些最常见的Web漏洞,难度各不相同。请注意,该软件中