Introduction to server-side character encoding, decoding and garbled processing methods in Nodejs-JS Tutorial-php.cn

巴扎黑

Sep 05, 2017 am 09:48 AM

Tags: javascript, nodejs, garbled characters

This article introduces server-side character encoding and decoding in Node.js and how to deal with garbled text. It may be a useful reference for anyone interested.

Preface

In web server development, character encoding and decoding come up almost every day. When they are handled incorrectly, the result is garbled text that is tedious to track down.

Many developers working on Node servers lack a solid grounding in character encoding, so when such a problem appears they are at a loss and spend a lot of time troubleshooting.

This article first briefly introduces the basics of character encoding and decoding, then shows how to encode and decode in Node, and finally gives a server-side code example. Code examples related to this article can be found here.

About character encoding and decoding

In network communication, what is transmitted is always binary bits, regardless of whether the content is text or an image, and whether the language is Chinese or English.

For example, the client sends "Hello" to the server.

Client --- Hello ---> Server

This contains two key steps, corresponding to encoding and decoding.

1. Client: Encode the string "Hello" into the binary bits required by the computer network.

2. Server: Decode the received binary bits into the string "Hello".

To summarize:

1. Encoding: Convert the data to be transmitted into the corresponding binary bits.

2. Decoding: Convert binary bits into original data.
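The two steps can be sketched with Node's built-in Buffer. This is a minimal illustration rather than the article's own code; the choice of UTF-8 is an assumption, and which encoding both sides should use is exactly the question the following sections take up.

```javascript
// Encoding: string -> bytes. Decoding: bytes -> string.
const message = 'Hello';

// "Client" side: encode the string into bytes (UTF-8 is assumed here).
const encoded = Buffer.from(message, 'utf8');
console.log(encoded); // <Buffer 48 65 6c 6c 6f>

// "Server" side: decode the received bytes with the same encoding.
const decoded = encoded.toString('utf8');
console.log(decoded === message); // true
```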

The description above leaves out two important technical details, which the next section answers:

1. How does the client know which binary bits correspond to the string "Hello"?
2. After the server receives the binary bits, how does it know which string they represent?

About character set and character encoding

The previous section raised the question of converting between characters and binary. Since the two can be converted into each other, there must be well-defined conversion rules.

These conversion rules are the character sets and character encodings we often hear about.

A character set is a collection of characters (letters, punctuation marks, and so on). There are many character sets; common ones include ASCII, Unicode, and GBK. The main difference between character sets is which characters, and how many, they contain.

With the concept of a character set in place, let's turn to character encoding.

The character set tells us which characters are supported, but how each character is encoded is determined by the character encoding. For example, the Unicode character set can be encoded with UTF-8 (the most commonly used), UTF-16, or UTF-32.

To summarize:

Character set: A collection of characters. Different character sets contain different numbers of characters.
Character encoding: How the characters in a character set are actually encoded as bytes.
A character set may have multiple character encodings.

A character encoding can be regarded as a mapping table. The client and the server both use this table to convert between characters and binary.

For example, the character "你" occupies three bytes (0xe4 0xbd 0xa0) in UTF-8, and two bytes (0xc4 0xe3) in GBK.
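As a rough check of the byte values above, the following sketch uses only Node's built-in Buffer. Buffer has no GBK support, so the GBK bytes are written out by hand from the values quoted in the text rather than computed.

```javascript
const ch = '你';

// The Unicode code point identifies the character, independent of encoding.
console.log(ch.codePointAt(0).toString(16)); // 4f60

// The same character under two encodings of the Unicode character set.
const utf8Bytes = Buffer.from(ch, 'utf8');
const utf16Bytes = Buffer.from(ch, 'utf16le');
console.log(utf8Bytes);  // <Buffer e4 bd a0>
console.log(utf16Bytes); // <Buffer 60 4f>

// GBK bytes for the same character, hard-coded from the text above.
const gbkBytes = Buffer.from([0xc4, 0xe3]);
console.log(gbkBytes.length); // 2
```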

Character encoding and decoding examples

With the basics of character encoding and decoding covered, let's look at a simple example that uses the iconv-lite library to do the encoding and decoding.

As the code shows, the character is encoded with gbk. Decoding with gbk as well returns the original character, while decoding with utf8 produces garbled text.

var iconv = require('iconv-lite');

var oriText = '你';

var encodedBuff = iconv.encode(oriText, 'gbk');
console.log(encodedBuff);
// <Buffer c4 e3>

var decodedText = iconv.decode(encodedBuff, 'gbk');
console.log(decodedText);
// 你

var wrongText = iconv.decode(encodedBuff, 'utf8');
console.log(wrongText);
// ��
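The garbled output appears because the GBK bytes are not a well-formed UTF-8 sequence, so the decoder substitutes the replacement character U+FFFD. One way to notice this before the text is used is a strict decode with the built-in TextDecoder. The helper name `isValidUtf8` below is hypothetical, and the sketch assumes a Node build with ICU (the default for official releases), which the `fatal` option requires.

```javascript
// Hypothetical helper: returns true if buf is well-formed UTF-8.
function isValidUtf8(buf) {
  try {
    // With fatal: true, decode() throws on malformed input
    // instead of silently substituting U+FFFD.
    new TextDecoder('utf-8', { fatal: true }).decode(buf);
    return true;
  } catch (err) {
    return false;
  }
}

var gbkBytes = Buffer.from([0xc4, 0xe3]);    // "你" encoded as GBK
var utf8Bytes = Buffer.from('你', 'utf8');   // "你" encoded as UTF-8

console.log(isValidUtf8(gbkBytes));  // false
console.log(isValidUtf8(utf8Bytes)); // true

// The lenient decode keeps going and inserts replacement characters.
console.log(gbkBytes.toString('utf8').includes('\ufffd')); // true
```

A failed check only says the bytes are not UTF-8; it does not identify the encoding that was actually used.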

Practical example: Server-side encoding and decoding

The scenarios where encoding and decoding usually have to be handled are file reading and writing and network request processing. Here we take a network request as the example and show how to encode and decode on the server side.

Suppose we run the following http service, which listens for client requests. The client encodes its data with gbk, while the server decodes with utf8 by default.

If the request is decoded with the default utf8, garbled text appears, so special handling is required.

The server code is as follows (to keep the code short, checking the request method and the request encoding is skipped here):

var http = require('http');
var iconv = require('iconv-lite');

// Assume the client uses the POST method and gbk encoding
var server = http.createServer(function (req, res) {
  var chunks = [];

  // Collect the raw binary chunks of the request body
  req.on('data', function (chunk) {
    chunks.push(chunk);
  });

  req.on('end', function () {
    var bodyBuff = Buffer.concat(chunks);

    // Decode the binary data
    var body = iconv.decode(bodyBuff, 'gbk');
    console.log(body);

    res.end('HELLO FROM SERVER');
  });

});

server.listen(3000);
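The server collects every chunk and decodes only after concatenating them. One reason this matters is that a multi-byte character can be split across two chunks. The sketch below illustrates the effect with UTF-8 as a stand-in, since Node's built-in decoders do not support gbk; the split position is chosen by hand for the demonstration.

```javascript
const { StringDecoder } = require('string_decoder');

// "你好" in UTF-8 is six bytes; split in the middle of the first character.
const whole = Buffer.from('你好', 'utf8');
const chunkA = whole.subarray(0, 2);
const chunkB = whole.subarray(2);

// Decoding chunk by chunk corrupts the character that straddles the split.
const perChunk = chunkA.toString('utf8') + chunkB.toString('utf8');
console.log(perChunk === '你好'); // false

// Concatenating first, as the server code does, decodes correctly.
const joined = Buffer.concat([chunkA, chunkB]).toString('utf8');
console.log(joined === '你好'); // true

// Alternatively, StringDecoder holds back incomplete sequences between writes.
const decoder = new StringDecoder('utf8');
const streamed = decoder.write(chunkA) + decoder.write(chunkB) + decoder.end();
console.log(streamed === '你好'); // true
```

StringDecoder only covers the encodings Buffer itself supports, so for gbk the concatenate-then-decode approach shown in the server code remains the simpler option.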

The corresponding client code is as follows:

var http = require('http');
var iconv = require('iconv-lite');

var charset = 'gbk';

// Encode the character "你" with the chosen charset
var reqBuff = iconv.encode('你', charset);

var options = {
  hostname: '127.0.0.1',
  port: 3000,
  path: '/',
  method: 'POST',
  headers: {
    // Declare the charset of the request body as a Content-Type parameter
    'Content-Type': 'text/plain; charset=' + charset,
    'Content-Encoding': 'identity'
  }
};

// Send the request and print the server's response to stdout
var client = http.request(options, function(res) {
  res.pipe(process.stdout);
});

client.end(reqBuff);
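The server example hard-codes gbk and skips checking the request encoding. If the client declares its charset as a parameter of the Content-Type header (for example `text/plain; charset=gbk`, which is an assumption about the client and not something the article's server checks), the server could read it instead. The helper name `parseCharset` is hypothetical.

```javascript
// Hypothetical helper: extract the charset parameter from a Content-Type
// header value, falling back to a default when none is declared.
function parseCharset(contentType, fallback) {
  var match = /;\s*charset\s*=\s*"?([^\s;"]+)"?/i.exec(contentType || '');
  return match ? match[1].toLowerCase() : fallback;
}

console.log(parseCharset('text/plain; charset=gbk', 'utf8')); // gbk
console.log(parseCharset('text/plain; charset="GBK"', 'utf8')); // gbk
console.log(parseCharset('text/plain', 'utf8'));              // utf8
console.log(parseCharset(undefined, 'utf8'));                 // utf8
```

In the server's 'end' handler, the result of `parseCharset(req.headers['content-type'], 'utf8')` could be passed to `iconv.decode` in place of the literal 'gbk'. Because the value comes from the client, it is worth checking it with `iconv.encodingExists` before decoding.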
