Optimizing Memory Usage in Go: Mastering Data Structure Alignment

Barbara Streisand

Nov 16, 2024, 09:54 AM

Memory optimization is crucial for writing performant software. When a program has a finite amount of memory to work with, using that memory inefficiently causes many problems, which is why memory optimization matters for overall performance.

Go inherits many of C's advantageous features, but I notice that many people who use it do not know the full power of the language. One reason may be a lack of knowledge about how it works at a low level, or a lack of experience with languages like C or C++. I mention C and C++ because the foundations of Go are largely built on ideas from C/C++. It is no coincidence that I quote Ken Thompson from an interview at Google I/O 2012:

For me, the reason I was enthusiastic about Go is because just about the same time that we were starting on Go, I read (or tried to read) the C++0x proposed standard, and that was a convincer for me.

Today, we're going to talk about how to optimize a Go program and, more specifically, how to use structs well in Go. Let's first define what a struct is:

A struct is a user-defined data type that groups related variables of different types under a single name.

To understand where the problem lies, note that modern processors do not read memory 1 byte at a time. So how does the CPU fetch the data or instructions stored in memory?

In computer architecture, a word is the unit of data that a processor can handle in a single operation. It is a fixed-size group of bits (binary digits). On word-addressable machines the word is also the smallest addressable unit of memory, although most modern architectures address memory byte by byte. The word size of a processor determines its ability to handle data efficiently. Common word sizes include 8, 16, 32, and 64 bits. Some processor architectures support a half word, which is half the number of bits in a word, and a double word, which is two contiguous words.

Today the most common architectures are 32-bit and 64-bit. A 32-bit processor can access 4 bytes at a time, so its word size is 4 bytes. A 64-bit processor can access 8 bytes at a time, so its word size is 8 bytes.
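As a small illustration of the word sizes just described, the following sketch prints the pointer size of the platform it runs on. The helper name wordBytes is hypothetical, and treating the size of uintptr as the word size is an assumption that holds on common 32-bit and 64-bit targets.

```go
package main

import (
	"fmt"
	"strconv"
	"unsafe"
)

// wordBytes returns the size of a pointer-sized integer in bytes.
// On common 32-bit and 64-bit targets this matches the word size
// (an assumption, not a guarantee of the language).
func wordBytes() int {
	return int(unsafe.Sizeof(uintptr(0)))
}

func main() {
	fmt.Printf("word size: %d bytes (%d bits)\n", wordBytes(), wordBytes()*8)
	// strconv.IntSize is the size in bits of Go's int type on this platform.
	fmt.Printf("int size:  %d bits\n", strconv.IntSize)
}
```

On a 64-bit machine this typically reports 8 bytes, and on a 32-bit machine 4 bytes.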

When we store the data in memory, each 32-bit data word has a unique address, as shown below.

Figure 1 - Word-Addressable Memory

We can read the data in memory and load it to one register using the load word (lw) instruction.

With the theory above in place, let's see how it works in practice. To describe the struct case, I will demonstrate with the C language. A struct in C is a composite data type that lets you group multiple variables together and store them in the same block of memory. As we said earlier, how the CPU accesses the data depends on the architecture. Every data type in C has alignment requirements.

Suppose we have the following simple structures:

// structure 1
typedef struct example_1 {
    char c;
    short int s;
} struct1_t;


// structure 2
typedef struct example_2 {
    double d;
    int s;
    char c;
} struct2_t;

And now try to calculate the size of the following structures:

Size of Structure 1 = sizeof(char) + sizeof(short int) = 1 + 2 = 3 bytes.

Size of Structure 2 = sizeof(double) + sizeof(int) + sizeof(char) = 8 + 4 + 1 = 13 bytes.

The real sizes using a C program might surprise you.

#include <stdio.h>

// structure 1: a char followed by a short
typedef struct example_1 {
    char c;
    short int s;
} struct1_t;

// structure 2: a double, an int, and a char
typedef struct example_2 {
    double d;
    int s;
    char c;
} struct2_t;

int main(void)
{
    // sizeof yields a size_t, so it is printed with %zu
    printf("sizeof(struct1_t) = %zu\n", sizeof(struct1_t));
    printf("sizeof(struct2_t) = %zu\n", sizeof(struct2_t));

    return 0;
}

Output

sizeof(struct1_t) = 4
sizeof(struct2_t) = 16

As we can see, the size of the structures is different from those we calculated.

What is the reason for that?

C and Go employ a technique known as "struct padding" to ensure data is appropriately aligned in memory, which can significantly affect performance due to hardware and architectural constraints. Data padding and alignment conform to the requirements of the system's architecture, mainly to optimize CPU access times by ensuring data boundaries align with word sizes.
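To make the padding rule concrete, here is a minimal sketch of a layout calculator that reproduces the 4-byte and 16-byte results of the C program. The names field, alignUp, and structSize are hypothetical, and the sizes and alignments passed in assume a typical 64-bit ABI (char 1, short 2, int 4, double 8). It is a simplified model, not how any particular compiler is implemented.

```go
package main

import "fmt"

// field describes one struct member by its size and alignment in bytes.
type field struct {
	size, align uintptr
}

// alignUp rounds offset up to the next multiple of align.
// align must be a power of two.
func alignUp(offset, align uintptr) uintptr {
	return (offset + align - 1) &^ (align - 1)
}

// structSize places the fields in declaration order, padding before each
// field as needed, then pads the end so the total size is a multiple of
// the largest field alignment.
func structSize(fields []field) uintptr {
	var offset, maxAlign uintptr = 0, 1
	for _, f := range fields {
		offset = alignUp(offset, f.align) + f.size
		if f.align > maxAlign {
			maxAlign = f.align
		}
	}
	return alignUp(offset, maxAlign)
}

func main() {
	struct1 := []field{{1, 1}, {2, 2}}         // char, short int
	struct2 := []field{{8, 8}, {4, 4}, {1, 1}} // double, int, char
	fmt.Println("struct1_t:", structSize(struct1)) // 4
	fmt.Println("struct2_t:", structSize(struct2)) // 16
}
```

In struct1_t one padding byte is inserted after the char so the short starts at an even offset. In struct2_t the 13 bytes of data are rounded up to 16 so that every element of an array of struct2_t keeps its double on an 8-byte boundary.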

Let's go through an example to illustrate how Go handles padding and alignment. Consider the following struct:

type Employee struct {
  IsAdmin  bool
  Id       int64
  Age      int32
  Salary   float32
}

A bool is 1 byte, an int64 is 8 bytes, an int32 is 4 bytes, and a float32 is 4 bytes: 1 + 8 + 4 + 4 = 17 bytes in total.

Let's validate the struct size by examining the compiled Go program:

package main

import (
    "fmt"
    "unsafe"
)

type Employee struct {
    IsAdmin bool
    Id      int64
    Age     int32
    Salary  float32
}

func main() {

    var emp Employee

    fmt.Printf("Size of Employee: %d\n", unsafe.Sizeof(emp))
}

Output

Size of Employee: 24

The reported size is 24 bytes, not 17. This discrepancy is due to memory alignment. To understand how alignment works, we need to inspect the structure and visualize the memory it occupies.

Figure 2 - Unoptimized Memory Layout

The struct Employee consumes 8 * 3 = 24 bytes. You can see the problem now: there are a lot of empty holes in the layout of Employee (the gaps created by the alignment rules are called "padding").

Padding Optimization and Performance Impact

Understanding how memory alignment and padding affect the performance of an application is crucial. Data alignment influences how costly it is for the CPU to access the fields of a struct. That cost comes mostly from CPU cache effects rather than from raw clock cycles, because cache behavior depends heavily on data locality and alignment within memory blocks.

Modern CPUs fetch data from memory into a faster intermediary called cache, organized in fixed-size blocks (commonly 64 bytes). When data is well-aligned and localized within the same or fewer cache lines, the CPU can access it more quickly due to reduced cache loading operations.

Consider two Go structs that hold the same fields in a different order to illustrate poor versus optimal alignment: Misaligned, whose field order forces the compiler to insert extra padding, and Aligned, whose field order keeps padding to a minimum.

How Alignment Affects Performance

The CPU reads data in word-sized units rather than byte by byte. As described at the beginning, a word is 8 bytes on a 64-bit system and 4 bytes on a 32-bit system. In short, the CPU reads addresses in multiples of its word size. If the variable passportId is not aligned and straddles a word boundary, the CPU needs two accesses to fetch it instead of one: the first fetches bytes 0 to 7 and the second fetches the rest. This is inefficient, and it is why we need data structure alignment. By simply aligning the data, the compiler ensures that passportId can be retrieved in a single access.

Figure 3 - Comparing Memory Access Efficiency

Padding is the key to achieving data alignment. Padding occurs because modern CPUs are optimized to read data from memory at aligned addresses. This alignment allows the CPU to read the data in a single operation.

Figure 4 - Simply aligning the data

Without padding, data may be misaligned, leading to multiple memory accesses and slower performance. Therefore, while padding might waste some memory, it ensures that your program runs efficiently.

Padding Optimization Strategies

The Aligned struct consumes less memory simply because its fields are in a better order than those of Misaligned. Because of padding, the two structures, each holding 13 bytes of data, end up occupying 16 bytes and 24 bytes respectively. Hence, you save memory by simply reordering your struct fields.

Figure 5 - Optimizing Field Order

Improperly aligned data can slow performance as the CPU might need multiple cycles to access misaligned fields. Conversely, correctly aligned data minimizes cache line loads, which is crucial for performance, especially in systems where memory speed is a bottleneck.

A simple benchmark that traverses a collection of each struct bears this out: traversing Aligned takes less time than traversing Misaligned.
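One way such a traversal benchmark could be written is sketched below. It is not the original benchmark: the struct fields, the element count n, and the helper names sumMisaligned and sumAligned are assumptions made for illustration. It uses testing.Benchmark from the standard library so it can run as an ordinary program, and the timings it prints depend on the machine.

```go
package main

import (
	"fmt"
	"testing"
)

// Hypothetical structs with identical fields in a different order.
type Misaligned struct {
	IsActive   bool
	PassportId int64
	Age        int32
}

type Aligned struct {
	PassportId int64
	Age        int32
	IsActive   bool
}

const n = 1 << 20 // number of elements traversed per iteration (assumed)

// sumMisaligned walks the slice and reads one field of every element.
func sumMisaligned(items []Misaligned) int64 {
	var total int64
	for i := range items {
		total += items[i].PassportId
	}
	return total
}

// sumAligned does the same work over the more compact struct.
func sumAligned(items []Aligned) int64 {
	var total int64
	for i := range items {
		total += items[i].PassportId
	}
	return total
}

// sink keeps the compiler from discarding the benchmarked work.
var sink int64

func main() {
	mis := make([]Misaligned, n)
	ali := make([]Aligned, n)

	resMis := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink = sumMisaligned(mis)
		}
	})
	resAli := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink = sumAligned(ali)
		}
	})

	fmt.Println("Misaligned:", resMis)
	fmt.Println("Aligned:   ", resAli)
}
```

Both loops do the same arithmetic; the difference is that the Misaligned slice spans more memory for the same number of elements, so more cache lines have to be loaded during the traversal.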

Padding is added to make sure each struct field lines up properly in memory based on its alignment needs, as we saw earlier. But while it enables efficient access, padding also wastes space if the fields are not ordered well.

Understanding how to properly align struct fields to minimize memory waste due to padding is important for efficient memory usage, especially in performance-critical applications. Below, I will provide an example with a poorly aligned structure and then show an optimized version of the same structure.

In a poorly aligned struct, fields are ordered without considering their sizes and alignment requirements, which can lead to added padding and increased memory usage. Consider a Person struct that declares its fields in the order bool, float64, int32, string.

On a 64-bit platform, the total memory is then 1 (bool) + 7 (padding) + 8 (float64) + 4 (int32) + 4 (padding) + 16 (string) = 40 bytes.

An optimized structure arranges fields from largest to smallest alignment, significantly reducing or eliminating the need for additional padding. For Person, that means declaring the fields in the order float64, string, int32, bool.

The total memory then comprises 8 (float64) + 16 (string) + 4 (int32) + 1 (bool) + 3 (padding) = 32 bytes.

Let's prove the above by checking both versions with unsafe.Sizeof: on a 64-bit platform the poorly aligned struct reports 40 bytes and the optimized one reports 32 bytes.

Reducing the structure size from 40 bytes to 32 bytes means a 20% reduction in memory usage per instance of Person. This can lead to considerable savings in applications where many such instances are created or stored, improving cache efficiency and potentially reducing the number of cache misses.

Conclusion

Data alignment is a pivotal factor in optimizing memory utilization and enhancing system performance. By arranging struct data correctly, memory usage becomes not only more efficient but also faster in terms of CPU read times, contributing significantly to overall system efficiency.
