
Learn more about Buffers in Node

青灯夜游
2023-04-25 19:49:11


At the end of the Stream chapter we were left with a question: what exactly is the chunk that a stream hands to our code?

Printing it shows that chunk is a Buffer object whose elements are displayed as two-digit hexadecimal numbers, that is, values from 0 to 255.

Untitled 1.png

This tells us that the data flowing through a Stream is Buffer data, so let's take a closer look at what a Buffer really is.

Why was Buffer introduced in Node?

In the beginning, JavaScript ran only in the browser. It handled Unicode-encoded strings well, but it had difficulty with binary data and with strings in non-Unicode encodings. Binary is the lowest-level data format in a computer: video, audio, programs, and network packets are all stored as binary. Node therefore needed an object for working with binary data, and Buffer was born. It is used to process raw bytes in TCP streams, file system operations, and similar tasks.

Because Buffer is used so often in Node, it is exposed as a global when Node starts, so there is no need to load it with require().

ArrayBuffer

What is it

An ArrayBuffer is a chunk of raw binary data in memory. Its contents cannot be read or written directly; they must be accessed through a TypedArray object or a DataView, which interpret the bytes in the buffer as a specific format and read and write them through that format. A TypedArray exposes an array-like interface, so the data can be manipulated much like an array.

TypedArray view

The most commonly used view is the TypedArray view, which reads and writes simple numeric types in an ArrayBuffer, for example the Uint8Array (8-bit unsigned integer) view and the Int16Array (16-bit signed integer) view.
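As a rough illustration of how views work, the following sketch reads the same ArrayBuffer through a Uint8Array and through a DataView. The variable names are made up for this example; only the standard ArrayBuffer, Uint8Array, and DataView APIs are used.

```javascript
// A 4-byte block of raw memory; it has no read/write methods of its own.
const raw = new ArrayBuffer(4);

// A Uint8Array view exposes the same memory as an array of bytes.
const bytes = new Uint8Array(raw);
bytes[0] = 0x01;
bytes[1] = 0x02;

// A DataView reads the same bytes with an explicit byte order.
const view = new DataView(raw);
const bigEndian = view.getUint16(0, false);   // bytes read as 0x0102
const littleEndian = view.getUint16(0, true); // bytes read as 0x0201

console.log(raw.byteLength, bytes.length); // 4 4
console.log(bigEndian, littleEndian);      // 258 513
```

The DataView is used for the 16-bit read because it lets the byte order be stated explicitly, whereas an Int16Array view follows the byte order of the platform.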

Relationship with Buffer

The Buffer class in Node.js is a subclass of Uint8Array.
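A small check along these lines can make the relationship concrete. It is only a sketch; the names buf, sameBytes, and ImportedBuffer are chosen for this example.

```javascript
// Buffer is available as a global, and it is the same class the
// 'buffer' module exports.
const { Buffer: ImportedBuffer } = require('buffer');
console.log(ImportedBuffer === Buffer); // true

const buf = Buffer.from([1, 2, 3]);

// Buffer extends Uint8Array, so TypedArray checks apply to it.
console.log(buf instanceof Uint8Array); // true
console.log(ArrayBuffer.isView(buf));   // true

// A plain Uint8Array over the same memory region sees the same bytes.
// byteOffset matters because a Buffer may sit inside a larger ArrayBuffer.
const sameBytes = new Uint8Array(buf.buffer, buf.byteOffset, buf.length);
sameBytes[0] = 255;
console.log(buf[0]); // 255
```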

Buffer structure

A Buffer is an Array-like object, but it is mainly used to work with bytes.

Module structure

Buffer is a module that combines JS and C++: the performance-critical parts are implemented in C++ and the rest is implemented in JS.

Untitled 2.png

The memory used by a Buffer is not allocated on the V8 heap; it is off-heap memory.
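One way to observe this, sketched under the assumption that the running Node.js version reports the arrayBuffers field in process.memoryUsage(), is to compare memory figures before and after allocating a large Buffer. The 32 MiB size is arbitrary, and the exact numbers vary between runs and versions, so no expected output is shown.

```javascript
const SIZE = 32 * 1024 * 1024; // 32 MiB, an arbitrary size for this demo

const before = process.memoryUsage();
const big = Buffer.alloc(SIZE);
const after = process.memoryUsage();

// arrayBuffers reports memory held by ArrayBuffers, which includes Buffers.
const arrayBuffersGrowth = after.arrayBuffers - before.arrayBuffers;
// heapUsed reports JavaScript objects living on the V8 heap.
const heapGrowth = after.heapUsed - before.heapUsed;

console.log('buffer length:', big.length);
console.log('arrayBuffers grew by (bytes):', arrayBuffersGrowth);
console.log('heapUsed grew by (bytes):', heapGrowth);
```

If Buffer memory is off-heap as described, the growth should show up mainly under arrayBuffers rather than heapUsed.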

Object structure

A Buffer object is similar to an array, and its elements are displayed as two-digit hexadecimal numbers, that is, values from 0 to 255.

Untitled 3.png

This example shows that different characters occupy different numbers of bytes in a Buffer. Under UTF-8 encoding, a common Chinese character occupies 3 bytes, while English letters and half-width symbols occupy 1 byte.

What happens if an element is assigned a fractional number, a negative number, or a value greater than 255?
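The byte counts can be checked with a short sketch like the one below. The sample strings are chosen for illustration and are not taken from the original screenshots.

```javascript
const english = Buffer.from('a');
const chinese = Buffer.from('中');
const symbol = Buffer.from('!');

console.log(english.length, chinese.length, symbol.length); // 1 3 1
console.log(chinese.toString('hex')); // e4b8ad

// String length counts UTF-16 code units; byteLength counts encoded bytes.
const text = 'Node 中文';
console.log(text.length);             // 7
console.log(Buffer.byteLength(text)); // 11
```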

Untitled 4.png

In these cases, Buffer handles the value as follows:

  • If the value is less than 0, 256 is added repeatedly until the result is an integer between 0 and 255
  • If the value is greater than 255, 256 is subtracted repeatedly until the result is between 0 and 255
  • If the value has a fractional part, only the integer part is kept
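These rules can be tried out with a sketch such as the following; the input values are arbitrary examples.

```javascript
// Out-of-range and fractional values are wrapped or truncated to fit a byte.
const buf = Buffer.from([-1, -100, 256, 300, 3.7]);
console.log([...buf]); // [ 255, 156, 0, 44, 3 ]

// The same conversion applies when assigning to a single element.
const single = Buffer.alloc(1);
single[0] = -1;
console.log(single[0]); // 255
single[0] = 257;
console.log(single[0]); // 1
```

Here -1 + 256 = 255, -100 + 256 = 156, 256 - 256 = 0, 300 - 256 = 44, and 3.7 keeps only its integer part.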

Why does the Buffer display hexadecimal numbers

The data is still stored in memory as binary; hexadecimal is only how a Buffer displays that data.

A buffer of 2 bytes holds 16 bits in total, for example 00000001 00100011. Showing the bits directly is inconvenient, so they are converted to hexadecimal for display: <Buffer 01 23>
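A quick sketch of the same two bytes, shown once as bits and once as hexadecimal (the variable names are illustrative):

```javascript
const { inspect } = require('util');

const buf = Buffer.from([0b00000001, 0b00100011]);

// The bits that are actually stored in memory.
const bits = [...buf]
  .map((byte) => byte.toString(2).padStart(8, '0'))
  .join(' ');
console.log(bits); // 00000001 00100011

// The compact hexadecimal form used for display.
console.log(buf.toString('hex')); // 0123
console.log(inspect(buf));        // <Buffer 01 23>
```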

Creation of Buffer

Buffer.alloc and Buffer.allocUnsafe

Create a Buffer of a fixed size

Buffer.alloc(size [, fill [, encoding]])

  • size: The desired length of the new Buffer
  • fill: The value used to prefill the new Buffer. Default: 0
  • encoding: If fill is a string, this is its character encoding. Default: utf8

Untitled 5.png
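As a sketch of how the three parameters interact (the fill values here are examples, not the ones from the screenshot):

```javascript
// No fill value: every byte is 0.
const zeroed = Buffer.alloc(5);
console.log(zeroed.toString('hex')); // 0000000000

// Numeric fill: every byte gets that value.
const filledWithNumber = Buffer.alloc(5, 1);
console.log(filledWithNumber.toString('hex')); // 0101010101

// String fill: the encoded bytes repeat until the Buffer is full.
const filledWithString = Buffer.alloc(5, 'ab');
console.log(filledWithString.toString('hex')); // 6162616261

// String fill with an explicit encoding for the fill value.
const filledWithBase64 = Buffer.alloc(11, 'aGVsbG8gd29ybGQ=', 'base64');
console.log(filledWithBase64.toString()); // hello world
```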

Buffer.allocUnsafe(size)

Allocates a Buffer of size bytes. allocUnsafe runs faster than alloc, but the result is not initialized to 00 the way the result of Buffer.alloc is.

Untitled 6.png

The memory segment returned by allocUnsafe is not initialized, which is why the allocation is very fast, but the segment may still contain old data. If that old data is not overwritten before use, sensitive information may leak. Although allocUnsafe is fast, avoid it unless you overwrite the whole Buffer.

The Buffer module pre-allocates an internal Buffer instance of size Buffer.poolSize as a memory pool for fast allocation; new Buffer instances created with allocUnsafe are carved out of it.
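If allocUnsafe is used anyway, a common precaution is to overwrite every byte before the Buffer is read or passed on. A minimal sketch, with sizes and contents chosen only for illustration:

```javascript
const SIZE = 16;

// The contents are unpredictable at this point: they may be zeros
// or leftovers from earlier allocations.
const unsafe = Buffer.allocUnsafe(SIZE);

// Overwrite every byte before using the Buffer.
unsafe.fill(0);
console.log(unsafe.every((byte) => byte === 0)); // true

// Writing data that covers the whole Buffer has the same effect.
const message = Buffer.allocUnsafe(5);
const written = message.write('hello'); // 5 ASCII characters, 5 bytes
console.log(written, message.toString()); // 5 hello
```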

Buffer.from

Create a Buffer directly from existing content

  • Buffer.from(string [, encoding] )
  • Buffer.from(array)
  • Buffer.from(buffer)

Untitled 7.png
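The three call forms listed above can be sketched as follows; the sample data is illustrative.

```javascript
// From a string (utf8 by default) or a string in another encoding.
const fromString = Buffer.from('hello');
const fromHex = Buffer.from('68656c6c6f', 'hex');

// From an array of byte values.
const fromArray = Buffer.from([0x68, 0x69]);

// From another Buffer: the bytes are copied, not shared.
const copy = Buffer.from(fromString);
copy[0] = 0x48; // 'H'

console.log(fromString.toString());      // hello
console.log(fromHex.equals(fromString)); // true
console.log(fromArray.toString());       // hi
console.log(copy.toString());            // Hello
```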

Memory mechanism of Buffer.allocUnsafe

To use the memory it requests efficiently, Node.js adopts a slab mechanism: memory is requested in advance and handed out afterwards. It is a dynamic memory management mechanism.

A slab is a fixed-size region of memory that has already been requested. A slab is in one of three states:

  • full: fully allocated
  • partial: partially allocated
  • empty: nothing allocated

Node.js uses 8 KB as the boundary between small objects and large objects. (8 KB is the default Buffer.poolSize; in current Node.js versions only requests smaller than half of the pool size are served from the pool.)

Untitled 8.png

The size of a Buffer is fixed when it is created and cannot be changed afterwards!

Allocate small objects

If the requested size is less than 8 KB, Node allocates it as a small object.

The allocation process mainly uses a local variable, pool, as an intermediate object; every slab unit that is currently being allocated from is referenced by it. The following operation creates a brand-new slab unit and points pool at the newly requested SlowBuffer object.

Untitled 9.png

A slab unit

Untitled 10.png

Allocate a 2KB Buffer

After creating a 2KB buffer, a slab unit memory is as follows:

Untitled 11.png

This allocation is carried out by the allocate method.

Untitled 12.png

After we create a 2 KB buffer, the state of the current slab is partial.

When we create another buffer, the allocator checks whether the current slab has enough space left. If it does, the remaining space is used and the slab's allocation state is updated.

If the slab does not have enough space, a new slab is created and the space left over in the original slab is wasted.
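The small-object path described above can be modelled in a few lines. This is a simplified sketch, not the Node.js source: SLAB_SIZE, slab, slabOffset, slabState, and allocateSmall are hypothetical names, and details such as alignment are left out.

```javascript
const SLAB_SIZE = 8 * 1024; // size of one slab in this model

let slab = new ArrayBuffer(SLAB_SIZE); // the slab currently being used
let slabOffset = 0;                    // bytes of it already handed out

// Report the state of the current slab: empty, partial, or full.
function slabState() {
  if (slabOffset === 0) return 'empty';
  return slabOffset === SLAB_SIZE ? 'full' : 'partial';
}

// Hand out `size` bytes (assumed to be at most SLAB_SIZE) from the slab.
function allocateSmall(size) {
  if (size > SLAB_SIZE - slabOffset) {
    // Not enough room left: start a new slab. The space that remained
    // in the previous slab is simply not used again.
    slab = new ArrayBuffer(SLAB_SIZE);
    slabOffset = 0;
  }
  const chunk = new Uint8Array(slab, slabOffset, size);
  slabOffset += size;
  return chunk;
}

const first = allocateSmall(2048);
console.log(first.byteOffset, slabState()); // 0 partial

const second = allocateSmall(2048);
console.log(second.byteOffset, second.buffer === first.buffer); // 2048 true

// Only 4096 bytes remain, so a 5000-byte request starts a new slab.
const third = allocateSmall(5000);
console.log(third.byteOffset, third.buffer === first.buffer); // 0 false
```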

Allocate large objects

A buffer larger than 8 KB goes straight to the createUnsafeBuffer function, which allocates a slab unit that is occupied exclusively by this large Buffer object.

The allocation mechanism of allocate is shown in the figure
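The difference between the two paths can be observed from outside by looking at the ArrayBuffer behind each Buffer. The sketch below assumes the default Buffer.poolSize and the pooling behaviour of current Node.js versions; the sizes 16 and 16 * 1024 are arbitrary.

```javascript
const small = Buffer.allocUnsafe(16);        // well below the pool threshold
const large = Buffer.allocUnsafe(16 * 1024); // larger than the pool itself

console.log('pool size:', Buffer.poolSize);
console.log('small:', small.length, 'backing store:', small.buffer.byteLength);
console.log('large:', large.length, 'backing store:', large.buffer.byteLength);

// A pooled Buffer is a slice of a bigger shared ArrayBuffer,
// while a large Buffer owns an ArrayBuffer of exactly its own size.
const smallIsPooled = small.buffer.byteLength > small.length;
const largeIsExclusive = large.buffer.byteLength === large.length;
console.log(smallIsPooled, largeIsExclusive);
```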

Untitled 13.png

Buffer’s memory allocation mechanism

Untitled 14.png

Buffer and character encoding

By specifying a character encoding, Buffer instances and JavaScript strings can be converted to each other.

Untitled 15.png

Node currently supports eight encodings: utf8, ucs2, utf16le, latin1, ascii, base64, hex, and base64url. The implementation is as follows:

Untitled 16.png

A set of APIs is implemented for each encoding scheme, and Node.js returns a different object, and therefore different results, depending on the encoding passed in.
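A short sketch of converting the same data through several of these encodings (the sample strings are illustrative):

```javascript
const buf = Buffer.from('hello', 'utf8');

// The same bytes rendered in different encodings.
console.log(buf.toString('hex'));    // 68656c6c6f
console.log(buf.toString('base64')); // aGVsbG8=

// Decoding goes the other way: pass the encoding of the input string.
console.log(Buffer.from('aGVsbG8=', 'base64').toString('utf8')); // hello

// The encoding also changes how many bytes a string needs.
const utf16 = Buffer.from('hi', 'utf16le');
console.log(utf16.length, utf16.toString('hex')); // 4 68006900

// Buffer.isEncoding reports whether an encoding name is supported.
console.log(Buffer.isEncoding('utf8'), Buffer.isEncoding('gbk')); // true false
```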

Buffer and string conversion

Convert string to Buffer

This is done mainly with the Buffer.from method described above; the default encoding is utf-8.

Buffer to string

Untitled 17.png

Why are there garbled characters, and how can the problem be solved?

The stream reads 4 bytes at a time, and the chunks it outputs are as follows

Untitled 18.png

data += chunk is equivalent to data = data.toString() + chunk.toString()

Since one Chinese character occupies three bytes, the fourth byte of the first chunk cannot be decoded on its own and is displayed as a garbled character; the first and second bytes of the second chunk cannot form a character either, and so on. This is why garbled text appears.
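The problem and two possible fixes can be reproduced without a real stream. The sketch below simulates 4-byte chunks by slicing a Buffer; the sample text and variable names are made up for this example and are not the file used in the original screenshots.

```javascript
const { StringDecoder } = require('string_decoder');

const text = '你好世界'; // sample text: 4 characters, 12 bytes in UTF-8
const source = Buffer.from(text);

// Simulate a stream that delivers 4 bytes per chunk.
const chunks = [];
for (let i = 0; i < source.length; i += 4) {
  chunks.push(source.subarray(i, i + 4));
}

// Problem: += converts each chunk to a string on its own, so characters
// that straddle a chunk boundary are decoded incorrectly.
let garbled = '';
for (const chunk of chunks) garbled += chunk;

// Fix 1: collect the chunks and decode once at the end.
const joined = Buffer.concat(chunks).toString();

// Fix 2: use a StringDecoder, which holds back incomplete characters
// until the remaining bytes arrive.
const decoder = new StringDecoder('utf8');
let decoded = '';
for (const chunk of chunks) decoded += decoder.write(chunk);
decoded += decoder.end();

console.log(garbled === text); // false
console.log(joined === text);  // true
console.log(decoded === text); // true
```

On a real readable stream, calling setEncoding('utf8') makes the stream decode its chunks this way, so the data event delivers complete strings.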


Statement:
This article is reproduced from juejin.cn. If there is any infringement, please contact admin@php.cn to have it removed.