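PHP string handling: converting between full-width and half-width characters
Converting between half-width and full-width characters is a common string-processing problem. This article offers one approach.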
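I. Concepts
Full-width characters occupy Unicode code points 65281 to 65374 (hex 0xFF01 to 0xFF5E).
Half-width characters occupy code points 33 to 126 (hex 0x21 to 0x7E).
The space is a special case: the full-width space is 12288 (0x3000) and the half-width space is 32 (0x20).
Apart from the space, the full-width and half-width characters appear in the same order in Unicode, so each pair differs by the constant offset 0xFEE0 (65248).
Non-space characters can therefore be converted by adding or subtracting that offset, while the space is handled separately.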
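The offset arithmetic can be sketched on plain integers before any string handling is involved. The constant `WIDTH_OFFSET` and the two helper functions below are hypothetical names introduced only for this illustration; they are not part of the article's code.

```php
<?php
// Hypothetical helpers for illustration; they operate on integer code points only.
const WIDTH_OFFSET = 0xFEE0; // 0xFF01 - 0x21

function halfToFullCodePoint($cp) {
    if ($cp === 0x20) return 0x3000;                        // the space is the special case
    if ($cp >= 0x21 && $cp <= 0x7E) return $cp + WIDTH_OFFSET;
    return $cp;                                             // anything else is left unchanged
}

function fullToHalfCodePoint($cp) {
    if ($cp === 0x3000) return 0x20;
    if ($cp >= 0xFF01 && $cp <= 0xFF5E) return $cp - WIDTH_OFFSET;
    return $cp;
}

printf("0x%X\n", halfToFullCodePoint(0x41));   // 0xFF21 (full-width 'A')
printf("0x%X\n", fullToHalfCodePoint(0xFF5E)); // 0x7E   ('~')
```

Both ranges contain 94 code points, which is why a single constant is enough for everything except the space.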
II. Approach
1. Find the characters whose code points fall in the target range; a regular expression can do this.
2. Change their code points.
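As a minimal sketch of step 1, the following uses `preg_match_all` with the `u` modifier to pick the full-width characters out of a UTF-8 string. The sample string is made up for the illustration and is written with byte escapes so that the example does not depend on the encoding of the source file.

```php
<?php
// "ab" + full-width "12" (U+FF11, U+FF12) + full-width space (U+3000) + "cd"
$text = "ab\xEF\xBC\x91\xEF\xBC\x92\xE3\x80\x80cd";

// The /u modifier makes PCRE treat pattern and subject as UTF-8,
// so \x{...} matches whole code points rather than single bytes.
preg_match_all('/[\x{3000}\x{ff01}-\x{ff5e}]/u', $text, $matches);
$found = $matches[0];

echo count($found), "\n";      // 3
foreach ($found as $ch) {
    echo bin2hex($ch), "\n";   // efbc91, efbc92, e38080
}
```

Step 2 is then plain arithmetic on the code point of each match.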
III. Implementation
1. First, two functions that convert between Unicode code points and characters:
/**
 * Convert a Unicode code point to a character.
 * @param int $unicode
 * @return string|false UTF-8 character, or false if the code point is out of range
 */
function unicode2Char($unicode){
    if($unicode < 128)     return chr($unicode);
    if($unicode < 2048)    return chr(($unicode >> 6) + 192) .
                                  chr(($unicode & 63) + 128);
    if($unicode < 65536)   return chr(($unicode >> 12) + 224) .
                                  chr((($unicode >> 6) & 63) + 128) .
                                  chr(($unicode & 63) + 128);
    if($unicode < 2097152) return chr(($unicode >> 18) + 240) .
                                  chr((($unicode >> 12) & 63) + 128) .
                                  chr((($unicode >> 6) & 63) + 128) .
                                  chr(($unicode & 63) + 128);
    return false;
}

/**
 * Convert a character to its Unicode code point.
 * @param string $char must be a single UTF-8 character
 * @return int|false code point, or false if $char is not 1 to 4 bytes long
 */
function char2Unicode($char){
    // $char[0] is used instead of $char{0}: the curly-brace offset syntax was removed in PHP 8
    switch (strlen($char)){
        case 1 : return ord($char);
        case 2 : return (ord($char[1]) & 63) |
                        ((ord($char[0]) & 31) << 6);
        case 3 : return (ord($char[2]) & 63) |
                        ((ord($char[1]) & 63) << 6) |
                        ((ord($char[0]) & 15) << 12);
        case 4 : return (ord($char[3]) & 63) |
                        ((ord($char[2]) & 63) << 6) |
                        ((ord($char[1]) & 63) << 12) |
                        ((ord($char[0]) & 7) << 18);
        default :
            trigger_error('Character is not UTF-8!', E_USER_WARNING);
            return false;
    }
}
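To make the bit arithmetic in these two functions concrete, here is a worked example for one code point, U+FF21 (full-width 'A'), which falls in the three-byte UTF-8 range. The variable names are chosen for this illustration only.

```php
<?php
$cp = 0xFF21; // full-width 'A'

// Encode: the 3-byte UTF-8 layout is 1110xxxx 10xxxxxx 10xxxxxx
$b0 = ($cp >> 12) + 224;        // top 4 bits    + 0xE0 -> 0xEF
$b1 = (($cp >> 6) & 63) + 128;  // middle 6 bits + 0x80 -> 0xBC
$b2 = ($cp & 63) + 128;         // low 6 bits    + 0x80 -> 0xA1
$char = chr($b0) . chr($b1) . chr($b2);
echo bin2hex($char), "\n";      // efbca1

// Decode: strip the marker bits and shift the payload bits back into place
$decoded = (ord($char[2]) & 63)
         | ((ord($char[1]) & 63) << 6)
         | ((ord($char[0]) & 15) << 12);
printf("0x%X\n", $decoded);     // 0xFF21
```

The same masks and shifts appear in the 3-byte branches of `unicode2Char` and `char2Unicode`.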
2. Full-width to half-width
/**
 * Full-width to half-width
 * @param string $str
 * @return string
 */
function sbc2Dbc($str){
    // preg_replace_callback is used because the /e modifier was removed in PHP 7
    return preg_replace_callback(
        // Full-width characters: the space U+3000 and U+FF01 to U+FF5E
        '/[\x{3000}\x{ff01}-\x{ff5e}]/u',
        // Code point conversion
        function ($m) {
            $unicode = char2Unicode($m[0]);
            // 0x3000 is the space and is handled separately; for every other
            // full-width character, subtracting 0xfee0 gives the half-width
            // code point, which is ASCII, so chr() is enough
            return $unicode == 0x3000 ? ' ' : chr($unicode - 0xfee0);
        },
        $str
    );
}
3. Half-width to full-width
/**
 * Half-width to full-width
 * @param string $str
 * @return string
 */
function dbc2Sbc($str){
    // preg_replace_callback is used because the /e modifier was removed in PHP 7
    return preg_replace_callback(
        // Half-width characters: the space U+0020 and U+0021 to U+007E
        '/[\x{0020}-\x{7e}]/u',
        // Code point conversion
        function ($m) {
            $unicode = char2Unicode($m[0]);
            // 0x0020 is the space and is handled separately; for every other
            // half-width character, adding 0xfee0 gives the full-width code point
            return $unicode == 0x0020 ? unicode2Char(0x3000) : unicode2Char($unicode + 0xfee0);
        },
        $str
    );
}
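For comparison, here is a sketch of a different technique from the regex-based functions above: because only 95 character pairs are involved, the mapping can be precomputed once as a lookup table and applied with `strtr`. The function name `buildWidthMaps` and the variable names are assumptions made for this example, not part of the article's code.

```php
<?php
// Hypothetical helper (not from the article): build both lookup tables once.
function buildWidthMaps() {
    $toFull = array(' ' => "\xE3\x80\x80");   // U+0020 -> U+3000
    $toHalf = array("\xE3\x80\x80" => ' ');   // U+3000 -> U+0020
    for ($cp = 0x21; $cp <= 0x7E; $cp++) {
        $full = $cp + 0xFEE0;                 // always in the 3-byte UTF-8 range
        $fullChar = chr(($full >> 12) + 224)
                  . chr((($full >> 6) & 63) + 128)
                  . chr(($full & 63) + 128);
        $toFull[chr($cp)] = $fullChar;
        $toHalf[$fullChar] = chr($cp);
    }
    return array($toFull, $toHalf);
}

list($toFull, $toHalf) = buildWidthMaps();

$full = strtr('abc12 345', $toFull);
$half = strtr($full, $toHalf);

var_dump(strlen($full));        // int(27)
var_dump($half === 'abc12 345'); // bool(true)
```

This trades a small amount of memory for avoiding a callback on every match; which of the two is preferable depends on the input and is not something the article measures.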
IV. Testing
Sample code:
$a = 'abc12 345';
$sbc = dbc2Sbc($a);
$dbc = sbc2Dbc($sbc);

var_dump($a, $sbc, $dbc);
Result:
string(9) "abc12 345"
string(27) "abc12 345"
string(9) "abc12 345"
