
Implementing gb2312 encoding and decoding in js

高洛峰
2017-02-04 09:46:42

Requirements
Encode Chinese text as gb2312 in js. For example, "我" should become "%CE%D2" after encoding.

Analysis
As we all know, encodeURI and encodeURIComponent encode in utf-8. For example, "我" becomes "%E6%88%91". Experiments suggest there is no parameter anywhere for choosing a different encoding, so another way has to be found.
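The behaviour described above can be checked with a short sketch. The variable names are only illustrative; the two expected strings are the ones quoted in this article.

```javascript
// encodeURIComponent always emits the UTF-8 bytes of its input.
var utf8Encoded = encodeURIComponent("我");  // three UTF-8 bytes
var gb2312Wanted = "%CE%D2";                 // what the requirement asks for: two gb2312 bytes

// encodeURI behaves the same way for non-ASCII characters.
var sameViaEncodeURI = encodeURI("我") === utf8Encoded;

console.log(utf8Encoded);                    // "%E6%88%91"
console.log(sameViaEncodeURI);               // true
console.log(utf8Encoded === gb2312Wanted);   // false
```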
A rough analysis suggests the following solutions:
1. Use js to create a hidden iframe whose encoding is set to gb2312, put the text to convert into an input in the iframe's form, and submit the form with the GET method. Then read the resulting url and parse it to obtain the gb2312-encoded text.
2. Use ajax to send the text to the server for encoding, and have the result sent back.
3. Build a gb2312 encoding table in js.

Implementation
The first solution is fiddly and would need testing in several different browsers.
The second solution requires cooperation from the server.
The following implements the third solution.
At first we planned to store the encoding table in an array; later, to reduce the size of the js file, we switched to storing it in a string.
The js code is as follows:

Code

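function encodeToGb2312(str) {
    var strOut = "";
    for (var i = 0; i < str.length; i++) {
        var c = str.charAt(i);
        var code = str.charCodeAt(i);
        if (c == " ") {
            // a space is encoded as "+"
            strOut += "+";
        } else if (code >= 19968 && code <= 40869) {
            // hanzi: look up the two bytes in the table (four hex digits per character)
            var index = code - 19968;
            strOut += "%" + z.substr(index * 4, 2) + "%" + z.substr(index * 4 + 2, 2);
        } else {
            // other characters: hex code, padded to two digits
            // (only correct for ASCII; Chinese punctuation is not handled)
            var hex = code.toString(16).toUpperCase();
            strOut += "%" + (hex.length < 2 ? "0" + hex : hex);
        }
    }
    return strOut;
}
function decodeFromGb2312(str) {
    var strOut = '';
    for (var i = 0; i < str.length; i++) {
        var c = str.charAt(i);
        if (c == '+') {
            // "+" is a space
            strOut += ' ';
        } else if (c != '%') {
            // a, b, c, 1, 2, etc.: anything not starting with "%" is returned as is
            strOut += c;
        } else {
            // starts with "%"
            i++;
            var nextC = str.charAt(i);
            if (parseInt(nextC, 16) < 8) {
                // first hex digit is 0-7: a single ASCII byte, not a hanzi
                i++;
                strOut += decodeURIComponent(c + nextC + str.charAt(i));
            } else {
                // hanzi: join the hex digits of the two bytes ("%CE%D2" -> "CED2")
                var code = (str.substr(i, 2) + str.substr(i + 3, 2)).toUpperCase();
                i = i + 4;
                // only a match aligned to a 4-character entry is a real hit
                var index = -1;
                while ((index = z.indexOf(code, index + 1)) != -1) {
                    if (index % 4 == 0) {
                        strOut += String.fromCharCode(index / 4 + 19968);
                        break;
                    }
                }
            }
        }
    }
    return strOut;
}
// '{0}' is a placeholder for the generated table string
var z = '{0}';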

(Chinese punctuation is not handled here. The main reason is that Chinese, Japanese and Korean punctuation are mixed together in Unicode and spread across several ranges, and I was too lazy to sort them out. If anyone has such a table, could you send me a copy?)
Finally, use .NET to generate the table string assigned to z:
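To make the string-packed table easier to follow, here is a minimal sketch of the lookup in both directions. It assumes a hypothetical three-entry table (miniTable) and two hypothetical helpers (lookupBytes, lookupChar); the real table z covers every code point from 19968 to 40869 and is produced by a generator.

```javascript
// Hypothetical miniature table: four hex digits per character, in code point
// order, starting at 19968 (U+4E00). Illustrative stand-in for the real z.
var BASE = 19968;
var miniTable = "D2BB" + "B6A1" + "8140"; // 一 (U+4E00), 丁 (U+4E01), 丂 (U+4E02)

// Encode: the offset from BASE selects a 4-character slice of the table.
function lookupBytes(ch) {
    var index = ch.charCodeAt(0) - BASE;
    if (index < 0 || index * 4 >= miniTable.length) return null; // not in the table
    var entry = miniTable.substr(index * 4, 4);
    return "%" + entry.substr(0, 2) + "%" + entry.substr(2, 2);
}

// Decode: search for the 4-digit code, accepting only positions that are a
// multiple of 4, so a match straddling two entries is skipped.
function lookupChar(hex4) {
    var pos = -1;
    while ((pos = miniTable.indexOf(hex4, pos + 1)) !== -1) {
        if (pos % 4 === 0) return String.fromCharCode(pos / 4 + BASE);
    }
    return null; // no aligned match
}

console.log(lookupBytes("丁"));  // "%B6%A1"
console.log(lookupChar("8140")); // "丂"
console.log(lookupChar("BBB6")); // null: only found at offset 2, across two entries
```

The alignment check is what makes a plain indexOf search safe on a packed string; the price is a linear scan of the table for every decoded character.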

Code

// Requires: using System.Text; (StringBuilder, Encoding)
StringBuilder sb = new StringBuilder();
string strFormat = @"...z = '"; // the js code shown above
const int MinHanzi = 19968; // U+4E00
const int MaxHanzi = 40869; // U+9FA5
Encoding gb2312 = Encoding.GetEncoding("gb2312");
for (int i = MinHanzi; i <= MaxHanzi; i++)
{
    byte[] bytes = gb2312.GetBytes(((char)i).ToString());
    // two bytes per character, each written as two uppercase hex digits
    sb.AppendFormat("{0:X2}{1:X2}", bytes[0], bytes[1]);
}
// keep the whole table: dropping the final character would truncate the last entry
string str = strFormat + sb.ToString() + "';";
System.IO.File.WriteAllText(@"F:\encodeGb2312.js", str, Encoding.ASCII);
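Since every character in the range contributes exactly four hex digits, the generated string can be sanity-checked before it is used. The following is a sketch only: checkTable is a hypothetical helper that is not part of the original script, and the sample string is placeholder data of the right shape, not a real encoding table.

```javascript
var MIN_HANZI = 19968; // U+4E00
var MAX_HANZI = 40869; // U+9FA5
// 20902 entries * 4 hex digits = 83608 characters
var EXPECTED_LENGTH = (MAX_HANZI - MIN_HANZI + 1) * 4;

// Hypothetical helper: returns null if the table has the expected shape,
// otherwise a short description of the problem.
function checkTable(table) {
    if (table.length !== EXPECTED_LENGTH) {
        return "unexpected length " + table.length + ", expected " + EXPECTED_LENGTH;
    }
    if (!/^[0-9A-F]+$/.test(table)) {
        return "table contains characters other than uppercase hex digits";
    }
    return null;
}

// Placeholder data with the right shape (NOT a real table).
var dummy = new Array(20902 + 1).join("D2BB");
console.log(checkTable(dummy));              // null
console.log(checkTable(dummy.slice(0, -1))); // "unexpected length 83607, expected 83608"
```

In the generated file, the check would be run as checkTable(z). A table that is one character short makes the last entry unusable, which this length test catches.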
