Why is reading and writing files in Go so much slower than Perl?

王林
2024-02-09 21:30:24

Why does reading and writing files in Go appear to be much slower than in Perl? Many developers run into this question when they compare the two languages. In this article, PHP editor Strawberry answers it. When comparing file read and write speed between Go and Perl, two factors matter: language features and the underlying implementation. Go's design for file I/O differs from Perl's, and that difference shows up in measured performance. The question and answer below work through these factors in detail.

Question content

I use Go to make my code more efficient, but when I read and write files with Go, I find it is less efficient than Perl. Is the problem in my code, or is there another reason?

Build input file:

# input file:
for i in $(seq 1 600000); do
    echo server$((RANDOM%800+100)),$RANDOM,$RANDOM,$RANDOM >> sample.csv
done

Read and write files using perl:

time cat sample.csv | perl -ne 'chomp;print"$_"' > out.txt
real    0m0.249s
user    0m0.083s
sys 0m0.049s

Use go to read and write files:

package main

import (
    "bufio"
    "fmt"
    "io"
    "os"
    "strings"
)

func main() {

    filepath := "./sample.csv"
    file, err := os.OpenFile(filepath, os.O_RDWR, 0666)
    if err != nil {
        fmt.Println("open file error!", err)
        return
    }
    defer file.Close()
    buf := bufio.NewReader(file)
    for {
        line, err := buf.ReadString('\n')
        line = strings.TrimSpace(line)
        // Each fmt.Println call writes to stdout directly, without buffering.
        fmt.Println(line)
        if err != nil {
            if err == io.EOF {
                fmt.Println("file read ok!")
                break
            } else {
                fmt.Println("read file error!", err)
                return
            }
        }
    }
}

Then I run:

time go run read.go > out.txt
real    0m2.332s
user    0m0.326s
sys 0m2.038s

Why is reading and writing in Go nearly 10 times slower than in Perl?

Solution

You are comparing apples to oranges.

There are at least two methodological errors:

  1. Your Perl incantation measures how cat reads the file and sends its contents over a pipe(2), while perl reads the data from there, processes it, and writes the results to its standard output.

  2. Your Go incantation

    • measures the complete build process of the Go toolchain (compiling, linking, and writing out the executable image file) followed by a run of the compiled program, and
    • measures unbuffered writes to stdout (the fmt.Print* calls), whereas writes to stdout in the Perl code are, quoting the documentation, "typically line buffered if output is to the terminal and block buffered otherwise."
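To make the buffering point concrete, here is a minimal sketch that counts how many Write calls reach the underlying writer when lines are printed directly versus through a bufio.Writer. The countingWriter type and the two helper functions are hypothetical names invented for this illustration, and the counter only stands in for write(2) system calls on a real stdout; it is not a measurement of the original programs.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
)

// countingWriter is a hypothetical helper: it forwards data to w and
// counts how many times Write was called.
type countingWriter struct {
	w     io.Writer
	calls int
}

func (c *countingWriter) Write(p []byte) (int, error) {
	c.calls++
	return c.w.Write(p)
}

// unbufferedCalls prints n lines straight to the sink, the way
// fmt.Println writes straight to os.Stdout.
func unbufferedCalls(n int) int {
	sink := &countingWriter{w: io.Discard}
	for i := 0; i < n; i++ {
		fmt.Fprintln(sink, "server123,1,2,3")
	}
	return sink.calls
}

// bufferedCalls prints the same lines through a bufio.Writer, which
// only passes data on when its buffer fills up or Flush is called.
func bufferedCalls(n int) int {
	sink := &countingWriter{w: io.Discard}
	out := bufio.NewWriter(sink)
	for i := 0; i < n; i++ {
		fmt.Fprintln(out, "server123,1,2,3")
	}
	out.Flush() // push out whatever is still sitting in the buffer
	return sink.calls
}

func main() {
	const n = 1000
	fmt.Fprintln(os.Stderr, "writes without buffering:", unbufferedCalls(n))
	fmt.Fprintln(os.Stderr, "writes with bufio.Writer:", bufferedCalls(n))
}
```

Without buffering, every printed line becomes its own write to the sink; with bufio.Writer, the same data arrives in a handful of large writes. With a real stdout, each of those writes is a system call, which is consistent with the large sys time in the question's measurement.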

Let’s try to compare apples to apples.

First, here is a comparable Go implementation:

package main

import (
    "bufio"
    "bytes"
    "fmt"
    "os"
)

func main() {
    in := bufio.NewScanner(os.Stdin)
    out := bufio.NewWriter(os.Stdout) // buffer output instead of writing line by line

    for in.Scan() {
        s := bytes.TrimSpace(in.Bytes())

        if _, err := out.Write(s); err != nil {
            fmt.Fprintln(os.Stderr, "failed to write file:", err)
            os.Exit(1)
        }
    }

    // Flush whatever is still held in the output buffer.
    if err := out.Flush(); err != nil {
        fmt.Fprintln(os.Stderr, "failed to write file:", err)
        os.Exit(1)
    }

    if err := in.Err(); err != nil {
        fmt.Fprintln(os.Stderr, "reading failed:", err)
        os.Exit(1)
    }
}

Let’s save it as chomp.go and measure it:

  1. Build code:

    $ go build chomp.go

  2. Generate input file:

    $ for i in $(seq 1 600000); do echo server$((RANDOM%800+100)),$RANDOM,$RANDOM,$RANDOM; done >sample.csv

  3. Run perl code:

    $ time { perl -ne 'chomp; print "$_";' <sample.csv >out1.txt; }
    
    real    0m0.226s
    user    0m0.102s
    sys 0m0.048s
  4. Run it again to make sure it has read the input file from the file system cache:

    $ time { perl -ne 'chomp; print "$_";' <sample.csv >out1.txt; }
    
    real   0m0.123s
    user   0m0.090s
    sys    0m0.033s

    Notice how the execution time is reduced.

  5. Run go code on cached input:

    $ time { ./chomp <sample.csv >out2.txt; }
    
    real   0m0.063s
    user   0m0.032s
    sys    0m0.032s
  6. Make sure the results are the same:

    $ cmp out1.txt out2.txt

As you can see, on my linux/amd64 system with an SSD, the results are roughly the same.

I should also point out that to get reliable results you would need to run each command, say, 1000 times, average the results of each batch, and then compare the numbers. Still, I think this is enough to demonstrate what is wrong with your approach.
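The repeat-and-average idea can be sketched as a small Go harness. Everything here is an assumption made for illustration: the names averageRun and runOnce, the command-line layout, and the choice of 1000 runs. It also discards the program's output rather than writing it to a file, so it simplifies the commands shown above.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"time"
)

// averageRun calls run n times and returns the mean wall-clock duration.
func averageRun(n int, run func() error) (time.Duration, error) {
	if n <= 0 {
		return 0, fmt.Errorf("n must be positive, got %d", n)
	}
	var total time.Duration
	for i := 0; i < n; i++ {
		start := time.Now()
		if err := run(); err != nil {
			return 0, err
		}
		total += time.Since(start)
	}
	return total / time.Duration(n), nil
}

// runOnce starts program with the input file on its stdin.
// Stdout is left nil, so os/exec connects it to the null device.
func runOnce(program, input string) error {
	in, err := os.Open(input)
	if err != nil {
		return err
	}
	defer in.Close()

	cmd := exec.Command(program)
	cmd.Stdin = in
	return cmd.Run()
}

func main() {
	if len(os.Args) != 3 {
		fmt.Fprintln(os.Stderr, "usage: avgtime <program> <input-file>")
		return
	}
	program, input := os.Args[1], os.Args[2]

	avg, err := averageRun(1000, func() error { return runOnce(program, input) })
	if err != nil {
		fmt.Fprintln(os.Stderr, "benchmark failed:", err)
		os.Exit(1)
	}
	fmt.Println("average wall-clock time per run:", avg)
}
```

Built as, for example, avgtime, it could be invoked as ./avgtime ./chomp sample.csv. Comparing against the Perl one-liner would need a wrapper script, because runOnce passes no arguments to the program it starts.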

One more thing to consider: the running time of both programs is overwhelmingly dominated by filesystem I/O, so if you expected Go to be faster here, that expectation is unfounded. Both programs spend most of their time sleeping in the kernel's read(2) and write(2) system calls. A Go program might be faster than a Perl program in some CPU-bound cases (especially if it is written to take advantage of a multi-core system), but your example is not such a case at all.

Oh, and just to spell out an unstated fact: although the Go language specification does not mandate it, the Go you are using is compiled ahead of time (AOT), and go run is a hack for one-off throwaway jobs, not for serious work or for running code of any real complexity. In short, the Go you are using is not an interpreted language, even though the availability of go run may make it appear so. In fact, go run does what a normal go build would do, then runs the resulting executable, and then discards it.

You might be tempted to say that Perl also processes "source code", but the Perl interpreter is highly optimized for handling scripts, whereas Go's build toolchain, while incredibly fast compared to most other compiled languages, is not optimized for this.

Perhaps the more glaring difference is that the Perl interpreter actually interprets your (very simple) script, and chomp and print are so-called "built-in functions" that the interpreter readily provides to the script being executed. Contrast this with building a Go program: the compiler parses the source files and converts them into machine code; the linker reads the files of the compiled packages of the Go standard library (all those that are imported), extracts the compiled code from them, combines all this machine code, and writes out an executable image file (which is much like the perl binary itself!). This is, of course, a very resource-intensive process that has nothing to do with the actual program execution.

Statement:
This article is reproduced from stackoverflow.com. If there is any infringement, please contact admin@php.cn to have it removed.