Why is reading and writing files in Go so much slower than Perl?

王林

Feb 09, 2024, 09:30 PM


Why does reading and writing files in Go appear to be much slower than in Perl? Developers who use both languages run into this question fairly often. Comparing file I/O speed between Go and Perl means looking at two things: how each language's standard library handles reading and writing (output buffering in particular), and what the benchmark actually measures. The question and answer that follow work through a concrete case.

Question content

I use Go to make my code more efficient, but when I read and write files with Go, it turns out to be slower than Perl. Is the problem in my code, or is there another reason?

Build input file:

# input file:
for i in $(seq 1 600000); do
    echo server$((RANDOM%800+100)),$RANDOM,$RANDOM,$RANDOM >> sample.csv
done

Read and write the file using Perl:

time cat sample.csv | perl -ne 'chomp;print"$_"' > out.txt

real    0m0.249s
user    0m0.083s
sys     0m0.049s

Read and write the file using Go:

package main

import (
    "bufio"
    "fmt"
    "io"
    "os"
    "strings"
)

func main() {

    filepath := "./sample.csv"
    file, err := os.OpenFile(filepath, os.O_RDWR, 0666)
    if err != nil {
        fmt.Println("open file error!", err)
        return
    }
    defer file.Close()
    buf := bufio.NewReader(file)
    for {
        line, err := buf.ReadString('\n')
        line = strings.TrimSpace(line)
        fmt.Println(line)
        if err != nil {
            if err == io.EOF {
                fmt.Println("file read ok!")
                break
            } else {
                fmt.Println("read file error!", err)
                return
            }
        }
    }
}

Then I run:

time go run read.go > out.txt

real    0m2.332s
user    0m0.326s
sys     0m2.038s

Why is reading and writing in Go nearly 10 times slower than in Perl?

Solution

You are comparing apples to oranges.

There are at least two methodological errors:

1. Your Perl incantation measures how cat reads the file and sends its contents over a pipe(2), while perl reads the data from there, processes it, and writes the result to its standard output.
2. Your Go incantation
   - measures the complete build process of the Go toolchain (compiling, linking, and writing out the executable image file) followed by the run of the compiled program, and
   - measures unbuffered writes to the standard output (the fmt.Print* calls), whereas writes to the standard output in the Perl code are, quoting the documentation, "typically line buffered if output is to the terminal and block buffered otherwise."

Let’s try to compare apples to apples.

First, here is a comparable Go implementation:

package main

import (
    "bufio"
    "bytes"
    "fmt"
    "os"
)

func main() {
    in := bufio.NewScanner(os.Stdin)
    out := bufio.NewWriter(os.Stdout)

    for in.Scan() {
        s := bytes.TrimSpace(in.Bytes())

        if _, err := out.Write(s); err != nil {
            fmt.Fprintln(os.Stderr, "failed to write file:", err)
            os.Exit(1)
        }
    }

    if err := out.Flush(); err != nil {
        fmt.Fprintln(os.Stderr, "failed to write file:", err)
        os.Exit(1)
    }

    if err := in.Err(); err != nil {
        fmt.Fprintln(os.Stderr, "reading failed:", err)
        os.Exit(1)
    }
}

Let’s save it as chomp.go and measure it:

Build the code:

$ go build chomp.go

Generate the input file:

$ for i in $(seq 1 600000); do echo server$((RANDOM%800+100)),$RANDOM,$RANDOM,$RANDOM; done >sample.csv

Run the Perl code:

$ time { perl -ne 'chomp; print "$_";' <sample.csv >out1.txt; }

real    0m0.226s
user    0m0.102s
sys     0m0.048s

Run it again to make sure it has read the input file from the file system cache:
```
$ time { perl -ne 'chomp; print "$_";' <sample.csv >out1.txt; }

real   0m0.123s
user   0m0.090s
sys    0m0.033s
```
Notice how the execution time is reduced.

Run the Go code on the cached input:

$ time { ./chomp <sample.csv >out2.txt; }

real   0m0.063s
user   0m0.032s
sys    0m0.032s

Make sure the results are the same:

$ cmp out1.txt out2.txt

As you can see, on my Linux/amd64 system with an SSD, the results are in the same range: 0.123s for Perl and 0.063s for Go.

I should also point out that to get reasonable results you would need to run each command, say, 1000 times, average the results in each batch, and then compare the numbers. Still, I think this is enough to demonstrate what the problem with your approach is.
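
The repeated-run measurement could be sketched roughly as follows. This is a hedged illustration rather than a proper benchmarking tool: meanDuration and runFiltered are hypothetical helpers, the command-line layout (input file, output file, then the command) is an assumption made for this example, and each measured run includes the cost of starting the process.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"time"
)

// meanDuration calls run the given number of times and returns the
// average wall-clock time of a single call.
func meanDuration(runs int, run func() error) (time.Duration, error) {
	if runs <= 0 {
		return 0, fmt.Errorf("runs must be positive, got %d", runs)
	}
	var total time.Duration
	for i := 0; i < runs; i++ {
		start := time.Now()
		if err := run(); err != nil {
			return 0, err
		}
		total += time.Since(start)
	}
	return total / time.Duration(runs), nil
}

// runFiltered runs a command with its stdin redirected from inPath and its
// stdout redirected to outPath, like `cmd <in >out` in the shell.
func runFiltered(inPath, outPath, name string, args ...string) error {
	in, err := os.Open(inPath)
	if err != nil {
		return err
	}
	defer in.Close()

	out, err := os.Create(outPath)
	if err != nil {
		return err
	}
	defer out.Close()

	cmd := exec.Command(name, args...)
	cmd.Stdin = in
	cmd.Stdout = out
	cmd.Stderr = os.Stderr
	return cmd.Run()
}

func main() {
	if len(os.Args) < 4 {
		fmt.Fprintln(os.Stderr, "usage: bench INPUT OUTPUT COMMAND [ARGS...]")
		return
	}
	const runs = 1000 // the batch size suggested in the answer

	mean, err := meanDuration(runs, func() error {
		return runFiltered(os.Args[1], os.Args[2], os.Args[3], os.Args[4:]...)
	})
	if err != nil {
		fmt.Fprintln(os.Stderr, "benchmark failed:", err)
		os.Exit(1)
	}
	fmt.Printf("mean over %d runs: %v\n", runs, mean)
}
```

Assuming it is built as bench, it could be invoked as `./bench sample.csv out2.txt ./chomp` and as `./bench sample.csv out1.txt perl -ne 'chomp; print "$_";'`, and the two means compared.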

One more thing to consider: the running time of both programs is overwhelmingly dominated by filesystem I/O, so if you expected Go to be faster here, that expectation is unfounded: most of the time both programs sleep in the kernel, inside the read(2) and write(2) system calls. A Go program might be faster than a Perl program in some cases involving CPU-bound work (especially if it is written to take advantage of a multi-core system), but that is not the case at all with your example.

Oh, and to make an unstated fact explicit: although the Go language specification does not say that Go is an AOT-compiled language, the implementation you are using is exactly that, and go run is a hack for one-off throwaway gigs, not for serious work, nor for executing code of any serious complexity. In short, the Go you are using is not an interpreted language, although the availability of go run may make it appear so. In fact, go run does what a normal go build would do, then runs the resulting executable, and then discards it.

You might be tempted to say that Perl also processes "source code", but the Perl interpreter is highly optimized for handling scripts, whereas Go's build toolchain, while incredibly fast compared to most other compiled languages, is not optimized for that.

Perhaps the more obvious difference is that the Perl interpreter actually interprets your (very simple) script, and chomp and print are so-called "builtins", readily provided to the executing script by the interpreter. Contrast that with building a Go program: the compiler parses the source files and converts them into machine code; the linker reads the files of the compiled packages of the Go standard library (all that are imported), takes the machine code it needs from them, combines all this machine code, and writes out an executable image file (which is much like the perl binary itself!). Of course, this is a very resource-intensive process that has nothing to do with the actual program execution.
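
To see how much of the measured time goes into building rather than running, the build, run, and discard sequence described above can be imitated and timed step by step. This is only a sketch under stated assumptions: buildCommand and buildRunDiscard are hypothetical helpers, the real go run is more involved (it uses the build cache, for instance), a go toolchain is assumed to be on PATH, and the executable name assumes a Unix-like system.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
	"time"
)

// buildCommand returns the `go build` invocation that compiles src into exe.
func buildCommand(exe, src string) *exec.Cmd {
	return exec.Command("go", "build", "-o", exe, src)
}

// buildRunDiscard builds src into a temporary directory, runs the resulting
// executable with the current stdin/stdout, and deletes it afterwards.
// It returns the time spent building and the time spent running.
func buildRunDiscard(src string) (time.Duration, time.Duration, error) {
	dir, err := os.MkdirTemp("", "gorun-sketch-")
	if err != nil {
		return 0, 0, err
	}
	defer os.RemoveAll(dir) // discard the executable, as described above

	exe := filepath.Join(dir, "prog")

	start := time.Now()
	build := buildCommand(exe, src)
	build.Stderr = os.Stderr
	if err := build.Run(); err != nil {
		return 0, 0, fmt.Errorf("build: %w", err)
	}
	buildTime := time.Since(start)

	start = time.Now()
	prog := exec.Command(exe)
	prog.Stdin, prog.Stdout, prog.Stderr = os.Stdin, os.Stdout, os.Stderr
	if err := prog.Run(); err != nil {
		return buildTime, 0, fmt.Errorf("run: %w", err)
	}
	return buildTime, time.Since(start), nil
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: buildrun FILE.go")
		return
	}
	buildTime, runTime, err := buildRunDiscard(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// Report on stderr so the timings do not mix with the program's output.
	fmt.Fprintf(os.Stderr, "build: %v, run: %v\n", buildTime, runTime)
}
```

Assuming it is built as buildrun, running `./buildrun read.go > out.txt` reports the two phases separately, which makes it visible how much of a go run measurement is spent on compilation and linking.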


Statement

This article is reproduced from Stack Overflow. If there is any infringement, please contact admin@php.cn to have it removed.
