
The use of Stream-readable streams in Node.js

hzc
2020-06-17 09:26:04

A readable stream is a stream that produces data for the program to consume. Common ways of producing data include reading files from disk and reading the content of network requests. Recall the example used earlier to introduce what a stream is:

const rs = fs.createReadStream(filePath);

rs is a readable stream that produces data by reading a file from disk. The console's process.stdin is also a readable stream:

process.stdin.pipe(process.stdout);

This single statement echoes console input back to the console. process.stdin produces data by reading what the user types in the console.

Look back at the definition of readable streams:

Readable streams are streams that produce data for program consumption.

Custom readable stream


In addition to fs.createReadStream provided by the system, the src method provided by gulp or vinyl-fs, which you may have used, also returns a readable stream:

gulp.src(['*.js', 'dist/**/*.scss'])

If you want to produce data in a specific way and hand it over to the program for consumption, how do you start?

You can do it in two simple steps

  1. Inherit from the Readable class of the stream module
  2. Override the _read method and call this.push to put the produced data into the queue to be read

The Readable class already does most of the work a readable stream requires. You only need to inherit from it and put the data-producing logic in the _read method to implement a custom readable stream.

For example, implement a stream that generates a random number every 100 milliseconds (of no practical use):

const Readable = require('stream').Readable;
class RandomNumberStream extends Readable {
    constructor() {
        super();
    }
    _read() {
        // The arrow function keeps `this` bound to the stream instance
        setTimeout(() => {
            const randomNumber = Math.floor(Math.random() * 10000);
            // Only strings or Buffers can be pushed; append a newline for readable output
            this.push(`${randomNumber}\n`);
        }, 100);
    }
}
module.exports = RandomNumberStream;

The class inheritance part of the code is straightforward. The main thing to look at is the implementation of the _read method, which has several noteworthy points:

  1. The Readable class provides a default _read method, but it produces no data (in current Node.js versions it only reports a "not implemented" error), so what we do is override it
  2. The _read method has a size parameter that suggests how much data should be read and returned to the read method. It is only advisory: many implementations ignore it, and we ignore it here too. It will be covered in detail later
  3. Data is pushed into the buffer through this.push. The concept of the buffer will be covered later; for now, think of it as squeezing data into a water pipe, after which it can be consumed
  4. In the default (non-object) mode, the content passed to push must be a string or a Buffer, not a number
  5. The push method has a second parameter, encoding, which specifies the encoding when the first argument is a string
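The size and encoding points above can be seen in a minimal sketch. GreetingStream is a hypothetical class made up for this illustration; it ignores the advisory size argument and passes 'hex' as the second argument to push so that the string is decoded before it is buffered.

```javascript
const { Readable } = require('stream');

// Hypothetical example class: pushes one chunk, then ends the stream.
class GreetingStream extends Readable {
    _read(size) {
        // `size` is only advisory; this sketch ignores it.
        // '68656c6c6f' is the hex form of "hello"; the second argument
        // tells push() how to decode the string into bytes.
        this.push('68656c6c6f', 'hex');
        this.push(null);
    }
}

const greeting = new GreetingStream();
// No pipe() or data listener here, so the data is pulled explicitly with read().
const firstChunk = greeting.read();
console.log(firstChunk.toString()); // prints "hello"
```

Because _read pushes synchronously in this sketch, the first read() call already returns the decoded Buffer.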

Execute it to see the effect

const RandomNumberStream = require('./RandomNumberStream');
const rns = new RandomNumberStream();
rns.pipe(process.stdout);

You will see numbers printed continuously in the console: a readable stream that generates random numbers has been implemented. There are still a few small problems to solve.

How to stop

We push a number to the buffer every 100 milliseconds, but a stream such as one reading a local file always comes to an end. How do we stop and signal that all the data has been read?

Just push a null to the buffer. Modify the code to allow consumers to define how many random numbers are needed:

const Readable = require('stream').Readable;
class RandomNumberStream extends Readable {
    constructor(max) {
        super();
        // Number of random numbers still to be produced
        this.max = max;
    }
    _read() {
        setTimeout(() => {
            if (this.max > 0) {
                const randomNumber = Math.floor(Math.random() * 10000);
                // Only strings or Buffers can be pushed; append a newline for readable output
                this.push(`${randomNumber}\n`);
                this.max -= 1;
            } else {
                // Pushing null signals that the stream has ended
                this.push(null);
            }
        }, 100);
    }
}
module.exports = RandomNumberStream;

The code uses a max property so that consumers can specify how many random numbers they need when instantiating the stream:

const RandomNumberStream = require('./');
const rns = new RandomNumberStream(5);
rns.pipe(process.stdout);

This time the console prints only 5 numbers.

Why is it setTimeout instead of setInterval

Careful readers may have noticed that the code generates a random number every 100 milliseconds with setTimeout rather than setInterval. Why does a single delay, with no repetition, still give the correct result?

This requires understanding the two ways in which streams work

  1. Flowing mode: data is read from the underlying system and provided to the application as quickly as possible
  2. Paused mode: the read() method must be called explicitly to read chunks of data

A stream is in paused mode by default, which means the program must call read() explicitly. In the example above, however, data arrives without any such call because pipe() switches the stream to flowing mode. In flowing mode the _read() method is called again automatically after each chunk is pushed, until the data is exhausted, so each _read() call only needs to produce data once.
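The repeated _read() calls described above can be observed with a small counter. This is only a sketch: CountingStream, remaining, and readCalls are names invented for the illustration, and a 10 ms delay stands in for the article's 100 ms.

```javascript
const { Readable } = require('stream');

// Hypothetical helper class: emits `max` chunks and counts _read() calls.
class CountingStream extends Readable {
    constructor(max) {
        super();
        this.remaining = max;
        this.readCalls = 0;
    }
    _read() {
        this.readCalls += 1;
        // Each call schedules exactly one push: a chunk or the end marker.
        setTimeout(() => {
            if (this.remaining > 0) {
                this.remaining -= 1;
                this.push('x');
            } else {
                this.push(null);
            }
        }, 10);
    }
}

const counting = new CountingStream(3);
const finished = new Promise((resolve) => {
    // Attaching a data listener switches the stream to flowing mode.
    counting.on('data', () => {});
    counting.on('end', () => resolve(counting.readCalls));
});
finished.then((calls) => console.log(`_read() was called ${calls} times`));
```

Even though each _read() call schedules only one timer, the stream calls _read() again after every push, which is why setTimeout is sufficient.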

Switching between flowing mode and paused mode

Any of the following switches a stream from the default paused mode to flowing mode:

  1. Add a data event listener
  2. Call the resume() method to start the data flow
  3. Call the pipe() method to send the data to a writable stream

There are two ways to switch from flowing mode back to paused mode:

  1. If the stream has no pipe() destinations, call the pause() method
  2. If the stream has pipe() destinations, remove all data event listeners, then call the unpipe() method to remove all pipe destinations
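The mode switches listed above can be inspected through the stream's readableFlowing property. The following is a minimal sketch, assuming a source whose read() does nothing so that only the mode flag changes; the names source and states are made up for the illustration.

```javascript
const { Readable } = require('stream');

// A do-nothing source: read() never pushes, so only the mode flag is observed.
const source = new Readable({ read() {} });

const states = [];
states.push(source.readableFlowing); // null: nothing is consuming the stream yet
source.resume();
states.push(source.readableFlowing); // true: resume() switched to flowing mode
source.pause();
states.push(source.readableFlowing); // false: pause() switched back
console.log(states); // prints [ null, true, false ]
```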

data event

After pipe() is called, data flows from the readable stream into the writable stream, but to the user this looks like a black box. How does the data actually flow? Two mechanisms matter when switching between flowing mode and paused mode:

  1. The data event, which corresponds to flowing mode
  2. The read() method, which corresponds to paused mode

These two mechanisms are how a program drives the flow of data. Let's look at the data event of flowing mode first. As soon as a data event listener is attached to a readable stream, the stream enters flowing mode, so the code that consumes the stream above can be rewritten as:

const RandomNumberStream = require('./RandomNumberStream');
const rns = new RandomNumberStream(5);
rns.on('data', chunk => {
  console.log(chunk);
});

This prints something like the following to the console:

<Buffer 39 35 37 0a>
<Buffer 31 30 35 37 0a>
<Buffer 38 35 31 30 0a>
<Buffer 33 30 35 35 0a>
<Buffer 34 36 34 32 0a>

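When the readable stream has produced data that is ready to be consumed, it emits a data event. Once a data event listener is bound, data is delivered as quickly as possible. The listener receives the Buffer passed along by the readable stream as its first argument, which is the chunk printed to the console. To display it as a number, call the Buffer's toString() method.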
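As a sketch of the toString() conversion, the snippet below uses a fixed-content source instead of random numbers so the result is predictable. makeSource, asText, and done are hypothetical names introduced only for this illustration.

```javascript
const { Readable } = require('stream');

// Hypothetical fixed-content source used only for this illustration.
function makeSource() {
    const items = ['957\n', '1057\n'];
    return new Readable({
        read() {
            // Push the next item, or null once the list is exhausted.
            this.push(items.length ? items.shift() : null);
        },
    });
}

const asText = [];
const done = new Promise((resolve) => {
    makeSource()
        // Each chunk arrives as a Buffer; toString() turns it back into text.
        .on('data', (chunk) => asText.push(chunk.toString().trim()))
        .on('end', () => resolve(asText));
});
done.then((lines) => console.log(lines));
```

Calling setEncoding('utf8') on the stream is another option: the data listener then receives strings instead of Buffers.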

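When all the data has been processed, an end event is also emitted. Because stream processing is not a synchronous call, you need to listen for this event if you want to do something once everything is finished. Append the following to the end of the code: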

rns.on('end', () => {
  console.log('done');
});

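This prints done once all the data has been received. If an error occurs while the data is being processed, an error event is emitted, which you can listen for to handle the exception: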

rns.on('error', (err) => {
  console.log(err);
});
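To see the error listener fire, a source has to fail. The following sketch assumes a stream that reports a failure by calling destroy() with an error from inside read(); the names failing and outcome and the error message are invented for the example.

```javascript
const { Readable } = require('stream');

const failing = new Readable({
    read() {
        // Report a failure while producing data; destroy(err) emits 'error'.
        this.destroy(new Error('source unavailable'));
    },
});

const outcome = new Promise((resolve) => {
    failing.on('error', (err) => resolve(err.message));
    failing.resume(); // start the flow so that read() is actually called
});
outcome.then((message) => console.log('caught:', message));
```

Without an error listener, the same failure would surface as an uncaught exception and terminate the process.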

read(size)

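In paused mode the program must call read() explicitly to get data. The read() method pulls some data from the internal buffer and returns it; when no more data is available, it returns null.

When read() is called with a size argument, it returns that many bytes; if size bytes are not available, it returns null (unless the stream has ended, in which case the remaining buffered data is returned). If size is omitted, it returns all the data in the internal buffer.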
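The three cases can be tried on a stream whose buffer is filled by hand. This is a sketch under that assumption: the data is pushed manually rather than produced in read(), and the variable names are made up for the illustration.

```javascript
const { Readable } = require('stream');

// read() does nothing here; the buffer is filled manually below.
const letters = new Readable({ read() {} });
letters.push('abcdef');

const firstTwo = letters.read(2); // exactly 2 bytes: "ab"
const tooMany = letters.read(10); // only 4 bytes are buffered and the stream has not ended: null
const rest = letters.read();      // no size: everything left in the buffer, "cdef"

console.log(firstTwo.toString(), tooMany, rest.toString());
```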

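This raises a dilemma. In flowing mode the stream produces data and then emits a data event to notify the program, which is convenient. In paused mode the program has to read the data itself, so it may try to read before any data has been produced, and polling would be rather inefficient.

Node.js provides a readable event, which is emitted when the readable stream has data ready. Listen for this event first, then read once you are notified that data is available: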

const rns = new RandomNumberStream(5);
rns.on('readable', () => {
  let chunk;
  while((chunk = rns.read()) !== null){
    console.log(chunk);
  }
});

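This reads the data successfully. Note that not every call to read() returns data: as mentioned above, if the available data does not reach size, read() returns null, which is why the program includes the check in the loop.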

Can data be lost?

const stream = fs.createReadStream('/dev/input/event0');
stream.on('readable', callback);

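Could the following problem occur: a readable stream starts producing data as soon as it is created, so if some data is produced and a readable event is emitted before the readable listener is bound, could data be lost in extreme cases?

In fact it cannot. According to the Node.js event loop, creating the stream and registering the event listener run in the same turn of the event queue, while producing data and dispatching events are asynchronous operations. In addition, registering the listener with on uses process.nextTick, which ensures the listener is bound before any data is produced. For background, see the explanation of the event loop in the chapter on timers.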
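A small experiment makes the claim easier to trust. In this sketch the data is pushed manually before any listener exists, standing in for a real source; the names early and received are made up for the illustration. The chunk waits in the stream's internal buffer, so a readable listener attached afterwards still receives it.

```javascript
const { Readable } = require('stream');

const early = new Readable({ read() {} });
// Data is produced before any listener has been attached.
early.push('produced before any listener\n');
early.push(null);

const received = new Promise((resolve) => {
    const chunks = [];
    // The listener is attached only now, yet the buffered data is still delivered.
    early.on('readable', () => {
        let chunk;
        while ((chunk = early.read()) !== null) {
            chunks.push(chunk.toString());
        }
    });
    early.on('end', () => resolve(chunks.join('')));
});
received.then((text) => process.stdout.write(text));
```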

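At this point you may still have questions about when the data and readable events are emitted, how much data read() returns on each call, and when it returns null. The later chapter on writable streams covers these mechanisms in its back pressure section, with reference to the source code.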


This article is reproduced from juejin.cn.