
How to read and format a text stream received through a bash pipe?

WBOY
2024-02-10 23:30:09


In daily work we often process text data with command-line tools. On Linux, the bash pipe is a powerful mechanism that feeds the output of one command into the input of another. But when a large text stream arrives through a pipe, how do we read and format it efficiently? This article walks through a question and its answer on handling text streams received through a bash pipe.

Question content

Currently, I'm using the following to format data in an npm script.

npm run startwin | while IFS= read -r line; do printf '%b\n' "$line"; done | less

It works, but my coworker doesn't use Linux. So I want to implement while IFS= read -r line; do printf '%b\n' "$line"; done in Go and use the binary in the pipeline.

npm run startwin | magical-go-formater

What I tried

package main

import (
    "fmt"
    "io/ioutil"
    "os"
    "strings"
)

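func main() {
    fi, _ := os.Stdin.Stat() // get the FileInfo struct

    // Only read when stdin is a pipe or file rather than a terminal.
    if (fi.Mode() & os.ModeCharDevice) == 0 {
        // ReadAll blocks until stdin reaches EOF.
        bytes, _ := ioutil.ReadAll(os.Stdin)
        str := string(bytes)
        arr := strings.Fields(str) // splits on any whitespace, not just newlines

        for _, v := range arr {
            fmt.Println(v)
        }
    }
}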

Currently, the program silences all output of the text stream.

Workaround

You want to use bufio.Scanner for tail-style reading. IMHO the check you did on os.Stdin is unnecessary, but YMMV.

See this answer for an example. ioutil.ReadAll() (now deprecated; just use io.ReadAll()) reads until an error or EOF, but it does not loop over the input as it arrives. That is why you need bufio.Scanner.Scan().

Additionally, %b will convert any escape sequences in the text, e.g. any \n in the passed line will be rendered as a newline. Do you need that? Because Go has no equivalent format specifier, AFAIK.
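If the escape handling is needed, one possible approach is to expand the sequences by hand. The sketch below covers only a small, assumed subset of what printf '%b' understands (\\, \n, \t, \r, \e and \033); the name expandEscapes and the choice of sequences are illustrative, not part of the original answer.

```go
package main

import (
	"fmt"
	"strings"
)

// escapes maps an assumed subset of backslash sequences to the bytes they
// stand for. The two-backslash pair is listed so that `\\n` becomes a
// literal backslash followed by "n" rather than a newline.
var escapes = strings.NewReplacer(
	`\\`, `\`,
	`\n`, "\n",
	`\t`, "\t",
	`\r`, "\r",
	`\e`, "\x1b",
	`\033`, "\x1b",
)

// expandEscapes (hypothetical helper) replaces the sequences listed above
// and leaves every other character untouched.
func expandEscapes(s string) string {
	return escapes.Replace(s)
}

func main() {
	// The raw string contains a literal backslash-n and backslash-t.
	fmt.Println(expandEscapes(`first\nsecond\tcolumn`))
}
```

Calling expandEscapes on each scanned line before printing it would approximate the shell loop for these sequences. Anything outside the table, such as other octal or hexadecimal escapes, passes through unchanged.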

Edit

So I think your approach based on ReadAll() will/might work... eventually. I guess the behavior you expect is similar to bufio.Scanner: the receiving process handles bytes as they are written (this is actually a polling operation; see Scan() in the standard library source code for the dirty details).

But ReadAll() buffers everything it reads and does not return until an error or EOF finally occurs. I hacked up an instrumented version of ReadAll() (an exact copy of the standard library source code with a little extra instrumentation output), and you can see that it reads as bytes are being written. It just doesn't return and yield the content until the writing process finishes, at which point the writer closes its end of the pipe (its open file handle), which generates an EOF:

package main

import (
    "fmt"
    "io"
    "os"
    "time"
)

func main() {

    // os.Stdin.SetReadDeadline(time.Now().Add(2 * time.Second))

    b, err := readall(os.Stdin)
    if err != nil {
        fmt.Println("error: ", err.Error())
    }

    str := string(b)
    fmt.Println(str)
}

// readall is a copy of io.ReadAll with extra logging to stderr on every read.
func readall(r io.Reader) ([]byte, error) {
    b := make([]byte, 0, 512)
    i := 0
    for {
        if len(b) == cap(b) {
            // add more capacity (let append pick how much).
            b = append(b, 0)[:len(b)]
        }
        n, err := r.Read(b[len(b):cap(b)])

        //fmt.Fprintf(os.Stderr, "read %d - received: \n%s\n", i, string(b[len(b):cap(b)]))
        fmt.Fprintf(os.Stderr, "%s read %d - received %d bytes\n", time.Now(), i, n)
        i++

        b = b[:len(b)+n]
        if err != nil {
            if err == io.EOF {
                fmt.Fprintln(os.Stderr, "received eof")
                err = nil
            }
            return b, err
        }
    }
}
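For contrast, here is a small in-process sketch of the streaming behavior, using io.Pipe as a stand-in for the shell pipe. The writer refuses to send the next line, or to close the pipe, until the reader confirms it has received the previous one. A bufio.Scanner satisfies that protocol because it hands over each line as it arrives; a reader that waits for EOF could not. The function name linesSeenBeforeClose and the handshake channel are assumptions made for this illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
)

// linesSeenBeforeClose (hypothetical helper) feeds two lines through an
// in-memory pipe and counts how many of them the scanner delivered while
// the write end was still open.
func linesSeenBeforeClose() int {
	pr, pw := io.Pipe()
	ack := make(chan struct{})

	go func() {
		for _, line := range []string{"first\n", "second\n"} {
			// WriteString blocks until the reader has consumed the data.
			io.WriteString(pw, line)
			// Wait for the reader to confirm the line before continuing.
			<-ack
		}
		// The pipe is closed only after every line has been acknowledged.
		pw.Close()
	}()

	sc := bufio.NewScanner(pr)
	n := 0
	for sc.Scan() {
		n++
		ack <- struct{}{}
	}
	return n
}

func main() {
	fmt.Println("lines delivered before the pipe was closed:", linesSeenBeforeClose())
}
```

Replacing the scanner loop with a single io.ReadAll(pr) call in this setup would hang, since the writer would wait for an acknowledgement that only arrives after a read that itself waits for EOF.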

I just wrote a cheap script to generate input, simulating some long-running work that only writes periodically, which is how I imagine npm behaves in your case:

#!/bin/sh

for x in 1 2 3 4 5 6 7 8 9 10
do
  cat ./main.go
  sleep 10
done

BTW, I find reading the actual standard library code really helpful... or at least interesting in cases like this.


Statement:
This article is reproduced from stackoverflow.com. If there is any infringement, please contact admin@php.cn to have it removed.