What is a stream? How should we understand data flow? This article takes an in-depth look at streams in Node.js. I hope you find it helpful!
A stream is an abstract data interface that inherits from EventEmitter. It can send and receive data; in essence, it lets data flow.
Streams are not unique to Node; they are one of the most basic operating mechanisms of an operating system. In Linux, the pipe operator | connects streams; Node simply encapsulates the same idea and provides the corresponding APIs.
Why do we need to do it bit by bit?
First, use the following code to create a file of about 400MB.
Next, read the file with readFile.
When the service starts normally, it occupies about 10MB of memory. After making a request with curl http://127.0.0.1:8000, memory climbs to about 420MB, roughly the size of the file we created.
Now rewrite the handler using a stream instead.
Making the request again, memory only rises to about 35MB, a big reduction compared with readFile.
If we do not use the streaming mode and wait for the large file to be loaded before operating, there will be the following problems:
- Memory usage spikes, which can crash the system
- The CPU is limited and shared among many programs; loading a huge file in one go hogs it for a long time

In short, reading a large file in one shot puts unbearable pressure on both memory and the network.
How can I do it bit by bit?
When we read a file the ordinary way, we can only work with the data after the whole read completes. As mentioned above, stream inherits from EventEmitter, so we can listen for data as it arrives: switch to a streaming read, use on("data", () => {}) to receive each chunk, and use on("end", () => {}) to know when the transfer is finished. Every time a chunk of data is transferred, the data event fires and the callback receives that chunk for processing; after all the data has been transferred, the end event fires.
Data flow process
Where the data comes from: source
Data flows from one place to another. Let’s first look at the source of the data.
- http request: request data from an interface
- console: standard input (stdin)
- file: read a file's contents, as in the example above
The connecting pipe: pipe
source and dest are connected by a pipe; the basic syntax is source.pipe(dest). Once connected, data flows from source to dest, and we no longer need to manually listen for the data/end events as in the code above.
pipe has strict requirements: source must be a readable stream and dest must be a writable stream.
So what exactly is the data that flows, and what is the chunk in the code? A chunk is simply one piece of the transferred data, a Buffer by default.
Where the data goes: dest
A stream has three common output destinations:

- console: standard output (stdout)
- http request: the response to an interface request
- file: write to a file
Types of streams
Readable Streams
A readable stream is an abstraction over the source that provides the data. All Readables implement the interface defined by the stream.Readable class.

Read modes
A readable stream has two modes, flowing mode and paused mode, which determine how chunk data flows: automatically or manually.
A Readable has a _readableState property containing a flowing attribute that records the flow mode. It has three possible values:

- true: flowing mode
- false: paused mode
- null: initial state
Flowing mode
Data is read automatically from the underlying source and delivered to the application through events. You enter this mode by:

- Listening for the data event: once a data listener is added, available data is pushed to the callback. You must consume each chunk yourself; unhandled data is lost.
- Calling stream.pipe to send the data to a Writable
- Calling stream.resume
Paused mode
Data accumulates in the internal buffer and must be consumed by explicitly calling stream.read(). You use this mode by listening for the readable event: the readable stream fires it once data is ready, and in the callback you call stream.read() to actively consume the data. The readable event means the stream has news: either new data is available, or the end of the stream has been reached.
- A readable stream is in the initial state (flowing = null) right after creation
Switching from paused mode to flowing mode:

- listen for the data event
- call stream.resume
- call stream.pipe to send the data to a Writable

Switching from flowing mode to paused mode:

- remove the data event listener
- call stream.pause
- call stream.unpipe to remove the pipe destination
Implementation principle
To create a custom readable stream, inherit from Readable and implement the _read method. The stream maintains an internal buffer; when read is called, it decides whether it needs to request more data from the underlying source. When the buffer is empty or its length falls below highWaterMark, _read is called to fetch data from the underlying source.
Writable Streams
A writable stream is an abstraction over the destination of the data. It consumes data flowing from upstream and writes it to the underlying device; a common example is writing to the local disk.
Characteristics of writable streams
- Data is written with write
- end writes a final piece of data and closes the stream: end = write + close
- When ws.write(chunk) returns false, the buffered data has reached or exceeded highWaterMark. This is only a warning, not an error: we can keep writing, and nothing is forcibly interrupted, but unconsumed data keeps piling up in the writable stream's internal buffer. Once the buffer has been flushed, the drain event fires to signal that it is safe to write again.
Customized writable stream
All Writables implement the interface defined by the stream.Writable class.
You only need to implement the _write method to write data to the underlying destination:

- Calling writable.write writes data to the stream, which in turn calls _write to write it to the underlying destination
- When _write has written its data successfully, it must call the next callback so the stream can process the next chunk
- writable.end(data) must be called to finish the stream (data is optional); after that, calling write again throws an error
- After end is called and all underlying writes have completed, the finish event fires
Duplex Streams
A duplex stream is both readable and writable. It inherits from both Readable and Writable, so it can act as either one. A custom duplex stream must implement both Readable's _read method and Writable's _write method.
The net module can create sockets, and a socket is a typical Duplex in Node.js. Consider a TCP client as an example.
client is a Duplex: its writable side sends messages to the server, and its readable side receives the server's messages. The data flowing in the two directions has no direct relationship.
Transform Streams
In the example above, the data in the readable stream (0/1) and the data in the writable stream ('F', 'B', 'B') are isolated from each other; the two have no relationship. With a Transform, however, data written on the writable side is automatically transformed and then appears on the readable side.
Transform inherits from Duplex and already implements the _write and _read methods; you only need to implement the _transform method.
gulp, the stream-based build automation tool, is a good example: its official sample code is essentially a chain of pipes from source files, through transforms, to a destination.
Duplex and Transform selection
Comparing the examples above: choose Duplex when a stream serves both a producer and a consumer whose data are unrelated, and choose Transform when you just want to transform the data.
Backpressure problem
What is backpressure
The backpressure problem comes from the producer-consumer model, where the consumer processes data too slowly.

For example, suppose a download step produces data at 3MB/s while the compression step that follows it only manages 1MB/s. The buffer queue in between soon piles up: either the memory consumption of the whole process climbs, or the buffer overflows and some data is lost.
What is back pressure processing
Backpressure handling can be understood as a process of "complaining" upstream.

When the compression step finds that the backlog in its buffer has exceeded the threshold, it tells the download step: "I'm too busy, stop sending for now." On receiving the message, the download step pauses sending data downstream.
How to deal with back pressure
There are various ways to move data from one process to another; in Node.js the built-in one is .pipe(). At the most basic level, the process involves two separate components: the source of the data and the consumer.

When .pipe() is called on the source, it notifies the consumer that there is data to transmit, and the pipe function sets up the appropriate backpressure handling around the events.

When the data cache exceeds highWaterMark or the write queue is busy, .write() returns false.

When false is returned, the backpressure system kicks in: it pauses the incoming Readable so it stops sending data. Once the write buffer is emptied, a drain event is emitted and the incoming data flow resumes.

Once the queue is fully processed, the backpressure mechanism lets data be sent again; the memory that was in use is freed, ready for the next batch of data.
In short, pipe's backpressure handling:

- splits the data into chunks and writes them
- pauses reading when a chunk is too large or the queue is busy
- resumes reading when the queue is empty
The above is the detailed content of An in-depth analysis of Stream in Node. For more information, please follow other related articles on the PHP Chinese website!
