What is the most efficient way to read a byte file into an int64 slice?

PHPz
2024-02-09 11:36:09

This article answers a common question: what is the most efficient way to read a file of packed bytes into an int64 slice? The question concerns Go, and the constraint is memory: the data is too large to hold both the raw bytes and a decoded copy at once. The approach taken in the solution is to read each file once and then reinterpret or convert the bytes in place, rather than decoding them through a second buffer.

Question content

I have several files of packed int64 values. I need them in memory as int64 slices. The problem is that the files combined exceed half the size of the machine's memory, so space is limited. The standard option in Go looks like:

a := make([]int64, f.Size()/8)
binary.Read(f, binary.LittleEndian, a)

Unfortunately, the binary package immediately allocates a temporary []byte of len(a)*8 bytes (as large as the file itself), and the program runs out of memory.

It does work if I read each byte one at a time and copy it into the slice, but this is too slow.

Ideally I would convert the []byte directly to []int64 and just tell the compiler "OK, these are integers now", but obviously that won't work. Is there any way to accomplish something similar? Maybe by using the unsafe package, or by dropping into C if absolutely necessary?

Solution

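All of the functions below use minimal memory: each reads the file once with os.ReadFile and then treats the resulting []byte as a []int64 in place, so no second buffer the size of the file is allocated.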

// Same-endian architecture and data.
// Most efficient (no data conversion).
// Needs the "os" and "unsafe" imports; unsafe.SliceData requires Go 1.20 or later.
func readFileInt64SE(filename string) ([]int64, error) {
    b, err := os.ReadFile(filename)
    if err != nil {
        return nil, err
    }

    // Reinterpret the byte buffer as []int64 without copying.
    // Trailing bytes that do not fill a whole int64 are ignored.
    const i64Size = int(unsafe.Sizeof(int64(0)))
    i64Ptr := (*int64)(unsafe.Pointer(unsafe.SliceData(b)))
    i64Len := len(b) / i64Size
    i64 := unsafe.Slice(i64Ptr, i64Len)

    return i64, nil
}

For example, for maximum efficiency on an amd64 (little-endian) architecture with little-endian data (no data conversion required), use readFileInt64SE.
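
The same-endian shortcut is only correct when the host byte order matches the file. As a minimal sketch of how that could be checked at run time, the program below inspects the in-memory layout of a known integer and then applies the same cast to a small buffer. The helper names hostIsLittleEndian and bytesAsInt64 are hypothetical and do not come from the original answer.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"unsafe"
)

// hostIsLittleEndian reports whether this machine stores the least
// significant byte of an integer first. (Hypothetical helper.)
func hostIsLittleEndian() bool {
	var x uint16 = 0x0102
	// View the two bytes of x in memory order.
	b := (*[2]byte)(unsafe.Pointer(&x))
	return b[0] == 0x02
}

// bytesAsInt64 reinterprets b as []int64 in host byte order without
// copying. Trailing bytes that do not fill a whole int64 are ignored.
// (Hypothetical helper that mirrors the cast used for same-endian data.)
func bytesAsInt64(b []byte) []int64 {
	if len(b) < 8 {
		return nil
	}
	p := (*int64)(unsafe.Pointer(unsafe.SliceData(b)))
	return unsafe.Slice(p, len(b)/8)
}

func main() {
	// Encode sample values as little-endian bytes, standing in for file data.
	vals := []int64{42, -7}
	raw := make([]byte, 8*len(vals))
	for i, v := range vals {
		binary.LittleEndian.PutUint64(raw[i*8:], uint64(v))
	}

	fmt.Println("little-endian host:", hostIsLittleEndian())
	// On a little-endian host this prints the original values;
	// on a big-endian host the values come out byte-swapped.
	fmt.Println("reinterpreted:", bytesAsInt64(raw))
}
```

A caller could use such a check to pick the no-conversion path only when it is safe and fall back to one of the converting functions otherwise.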

Byte order fallacy - Rob Pike
https://commandcenter.blogspot.com/2012/04/byte-order-fallacy.html
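
If unsafe is not acceptable, one portable alternative (not part of the original answer) is to decode explicitly through a small reusable buffer, so the extra memory stays at a few kilobytes instead of a second copy of the file. The sketch below assumes little-endian data; readInt64sLE, roundTripLE, and the chunkValues size are illustrative choices, not names from the source.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// chunkValues is an arbitrary illustrative chunk size (8192 values = 64 KiB).
const chunkValues = 8192

// readInt64sLE decodes n little-endian int64 values from r. Apart from the
// result slice, it only allocates one fixed-size chunk buffer.
func readInt64sLE(r io.Reader, n int) ([]int64, error) {
	out := make([]int64, n)
	buf := make([]byte, chunkValues*8)
	for done := 0; done < n; {
		want := n - done
		if want > chunkValues {
			want = chunkValues
		}
		chunk := buf[:want*8]
		// Read exactly one chunk of bytes, or fail.
		if _, err := io.ReadFull(r, chunk); err != nil {
			return nil, err
		}
		// Decode the chunk value by value into the result slice.
		for i := 0; i < want; i++ {
			out[done+i] = int64(binary.LittleEndian.Uint64(chunk[i*8:]))
		}
		done += want
	}
	return out, nil
}

// roundTripLE encodes vals as little-endian bytes in memory and decodes
// them again, standing in for writing and reading a file.
func roundTripLE(vals []int64) ([]int64, error) {
	var raw bytes.Buffer
	var tmp [8]byte
	for _, v := range vals {
		binary.LittleEndian.PutUint64(tmp[:], uint64(v))
		raw.Write(tmp[:])
	}
	return readInt64sLE(&raw, len(vals))
}

func main() {
	got, err := roundTripLE([]int64{1, -2, 1 << 40})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(got)
}
```

For a real file, r would be the opened *os.File and n the file size divided by 8. This path costs one decode pass per value, so it is expected to be slower than the in-place cast but it does not depend on the host byte order.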

// LittleEndian in-place data conversion for any architecture
func readFileInt64LE(filename string) ([]int64, error) {
    b, err := os.ReadFile(filename)
    if err != nil {
        return nil, err
    }

    const i64Size = int(unsafe.Sizeof(int64(0)))
    i64Ptr := (*int64)(unsafe.Pointer(unsafe.SliceData(b)))
    i64Len := len(b) / i64Size
    i64 := unsafe.Slice(i64Ptr, i64Len)

    // Decode each 8-byte group and store the result back into the same memory.
    for i, j := i64Size, 0; i <= len(b); i, j = i+i64Size, j+1 {
        i64[j] = int64(binary.LittleEndian.Uint64(b[i-i64Size : i]))
    }

    return i64, nil
}

// BigEndian in-place data conversion for any architecture
func readFileInt64BE(filename string) ([]int64, error) {
    b, err := os.ReadFile(filename)
    if err != nil {
        return nil, err
    }

    const i64Size = int(unsafe.Sizeof(int64(0)))
    i64Ptr := (*int64)(unsafe.Pointer(unsafe.SliceData(b)))
    i64Len := len(b) / i64Size
    i64 := unsafe.Slice(i64Ptr, i64Len)

    for i, j := i64Size, 0; i <= len(b); i, j = i+i64Size, j+1 {
        i64[j] = int64(binary.BigEndian.Uint64(b[i-i64Size : i]))
    }

    return i64, nil
}
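
One way to gain confidence in the in-place conversion is a round trip through a temporary file. The sketch below is an illustration under stated assumptions rather than part of the original answer: readFileInt64 folds the two converting functions into one hypothetical helper that takes a byte order, and roundTrip, the directory prefix, and the file name are likewise made up for the example.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"os"
	"path/filepath"
	"unsafe"
)

// readFileInt64 reads filename and converts its contents in place to int64
// values using the given byte order. (Hypothetical helper combining the
// little-endian and big-endian variants.)
func readFileInt64(filename string, order binary.ByteOrder) ([]int64, error) {
	b, err := os.ReadFile(filename)
	if err != nil {
		return nil, err
	}
	n := len(b) / 8
	if n == 0 {
		return nil, nil
	}
	// The int64 view shares memory with b; no second buffer is allocated.
	i64 := unsafe.Slice((*int64)(unsafe.Pointer(unsafe.SliceData(b))), n)
	for j := 0; j < n; j++ {
		i64[j] = int64(order.Uint64(b[j*8 : j*8+8]))
	}
	return i64, nil
}

// roundTrip writes vals to a temporary file in the chosen byte order and
// reads them back with readFileInt64.
func roundTrip(vals []int64, bigEndian bool) ([]int64, error) {
	var order binary.ByteOrder = binary.LittleEndian
	if bigEndian {
		order = binary.BigEndian
	}
	dir, err := os.MkdirTemp("", "int64demo")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)

	raw := make([]byte, 8*len(vals))
	for i, v := range vals {
		order.PutUint64(raw[i*8:], uint64(v))
	}
	name := filepath.Join(dir, "data.bin")
	if err := os.WriteFile(name, raw, 0o600); err != nil {
		return nil, err
	}
	return readFileInt64(name, order)
}

func main() {
	vals := []int64{0, 1, -1, 1 << 40}
	for _, be := range []bool{false, true} {
		got, err := roundTrip(vals, be)
		fmt.Println("bigEndian:", be, "values:", got, "err:", err)
	}
}
```

Because the explicit decode does not depend on the host, both byte orders should return the original values on any architecture.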


Statement:
This article is reproduced from stackoverflow.com. If there is any infringement, please contact admin@php.cn to have it removed.