Here is another code review interview question for you. This question is more advanced than the previous ones and is targeted toward a more senior audience. The problem requires knowledge of slices and sharing data between parallel processes.
If you're not familiar with the slices and how they are constructed, please check out my previous article about the Slice Header
What is a Data Race?
A data race occurs when two or more threads (or goroutines, in the case of Go) concurrently access shared memory, and at least one of those accesses is a write operation. If there are no proper synchronization mechanisms (such as locks or channels) in place to manage access, the result can be unpredictable behavior, including corruption of data, inconsistent states, or crashes.
In essence, a data race happens when:
- Two or more threads (or goroutines) access the same memory location at the same time.
- At least one of the threads (or goroutines) is writing to that memory.
- There is no synchronization to control the access to that memory.
Because of this, the order in which the threads or goroutines access or modify the shared memory is unpredictable, leading to non-deterministic behavior that can vary between runs.
+----------------------+ +---------------------+ | Thread A: Write | | Thread B: Read | +----------------------+ +---------------------+ | 1. Reads x | | 1. Reads x | | 2. Adds 1 to x | | | | 3. Writes new value | | | +----------------------+ +---------------------+ Shared variable x (Concurrent access without synchronization)
Here, Thread A is modifying x (writing to it), while Thread B is reading it at the same time. If both threads are running concurrently and there’s no synchronization, Thread B could read x before Thread A has finished updating it. As a result, the data could be incorrect or inconsistent.
Question: One of your teammates submitted the following code for a code review. Please review the code carefully and identify any potential issues.
And here the code that you have to review:
package main import ( "bufio" "bytes" "io" "math/rand" "time" ) func genData() []byte { r := rand.New(rand.NewSource(time.Now().Unix())) buffer := make([]byte, 512) if _, err := r.Read(buffer); err != nil { return nil } return buffer } func publish(input []byte, output chan <p>What we have here? </p> <p>The publish() function is responsible for reading the input data chunk by chunk and sending each chunk to the output channel. It begins by using bytes.NewReader(input) to create a reader from the input data, which allows the data to be read sequentially. A buffer of size 8 is created to hold each chunk of data as it’s being read from the input. During each iteration, reader.Read(buffer) reads up to 8 bytes from the input, and the function then sends a slice of this buffer (buffer[:n]) containing up to 8 bytes to the output channel. The loop continues until reader.Read(buffer) either encounters an error or reaches the end of the input data.</p> <p>The consume() function handles the data chunks received from the channel. It processes these chunks using a bufio.Scanner, which scans each chunk of data, potentially breaking it into lines or tokens depending on how it’s configured. The variable b := scanner.Bytes() retrieves the current token being scanned. This function represents a basic input processing. </p> <p>The main() creates a buffered channel chunkChannel with a capacity equal to workersCount, which is set to 4 in this case. The function then launches 4 worker goroutines, each of which will read data from the chunkChannel concurrently. Every time a worker receives a chunk of data, it processes the chunk by calling the consume() function. The publish() function reads the generated data, breaks it into chunks of up to 8 bytes, and sends them to the channel.</p> <p>The program uses goroutines to create multiple consumers, allowing for concurrent data processing. Each consumer runs in a separate goroutine, processing chunks of data independently.</p> <p>If you run this code, noting suspicious will happen:<br> </p> <pre class="brush:php;toolbar:false">[Running] go run "main.go" [Done] exited with code=0 in 0.94 seconds
But there is a problem. We have a Data Race Risk. In this code, there’s a potential data race because the publish() function reuses the same buffer slice for each chunk. The consumers are reading from this buffer concurrently, and since slices share underlying memory, multiple consumers could be reading the same memory, leading to a data race. Let's try to use a race detection. Go provides a built-in tool to detect data races: the race detector. You can enable it by running your program with the -race flag:
go run -race main.go
If we add the -race flag to the run command we will receive the following output:
[Running] go run -race "main.go" ================== WARNING: DATA RACE Read at 0x00c00011e018 by goroutine 6: runtime.slicecopy() /GOROOT/go1.22.0/src/runtime/slice.go:325 +0x0 bytes.(*Reader).Read() /GOROOT/go1.22.0/src/bytes/reader.go:44 +0xcc bufio.(*Scanner).Scan() /GOROOT/go1.22.0/src/bufio/scan.go:219 +0xef4 main.consume() /GOPATH/example/main.go:40 +0x140 main.main.func1() /GOPATH/example/main.go:55 +0x48 Previous write at 0x00c00011e018 by main goroutine: runtime.slicecopy() /GOROOT/go1.22.0/src/runtime/slice.go:325 +0x0 bytes.(*Reader).Read() /GOROOT/go1.22.0/src/bytes/reader.go:44 +0x168 main.publish() /GOPATH/example/main.go:27 +0xe4 main.main() /GOPATH/example/main.go:60 +0xdc Goroutine 6 (running) created at: main.main() /GOPATH/example/main.go:53 +0x50 ================== Found 1 data race(s) exit status 66 [Done] exited with code=0 in 0.94 seconds
The warning you’re seeing is a classic data race detected by Go’s race detector. The warning message indicates that two goroutines are accessing the same memory location (0x00c00011e018) concurrently. One goroutine is reading from this memory, while another goroutine is writing to it at the same time, without proper synchronization.
The first part of the warning tells us that Goroutine 6 (which is one of the worker goroutines in your program) is reading from the memory address 0x00c00011e018 during a call to bufio.Scanner.Scan() inside the consume() function.
Read at 0x00c00011e018 by goroutine 6: runtime.slicecopy() /GOROOT/go1.22.0/src/runtime/slice.go:325 +0x0 bytes.(*Reader).Read() /GOROOT/go1.22.0/src/bytes/reader.go:44 +0xcc bufio.(*Scanner).Scan() /GOROOT/go1.22.0/src/bufio/scan.go:219 +0xef4 main.consume() /GOPATH/example/main.go:40 +0x140 main.main.func1() /GOPATH/example/main.go:55 +0x48
The second part of the warning shows that the main goroutine previously wrote to the same memory location (0x00c00011e018) during a call to bytes.Reader.Read() inside the publish() function.
Previous write at 0x00c00011e018 by main goroutine: runtime.slicecopy() /GOROOT/go1.22.0/src/runtime/slice.go:325 +0x0 bytes.(*Reader).Read() /GOROOT/go1.22.0/src/bytes/reader.go:44 +0x168 main.publish() /GOPATH/example/main.go:27 +0xe4 main.main() /GOPATH/example/main.go:60 +0xdc
The final part of the warning explains that Goroutine 6 was created in the main function.
Goroutine 6 (running) created at: main.main() /GOPATH/example/main.go:53 +0x50
In this case, while one goroutine (Goroutine 6) is reading from the buffer in consume(), the publish() function in the main goroutine is simultaneously writing to the same buffer, leading to the data race.
+-------------------+ +--------------------+ | Publisher | | Consumer | +-------------------+ +--------------------+ | | v | 1. Read data into buffer | | | v | 2. Send slice of buffer to chunkChannel | | | v | +----------------+ | | chunkChannel | | +----------------+ | | | v | 3. Consume reads from slice | v 4. Concurrent access (Data Race occurs)
Why the Data Race Occurs
The data race in this code arises because of how Go slices work and how memory is shared between goroutines when a slice is reused. To fully understand this, let’s break it down into two parts: the behavior of the buffer slice and the mechanics of how the race occurs. When you pass a slice like buffer[:n] to a function or channel, what you are really passing is the slice header which contains a reference to the slice’s underlying array. Any modifications to the slice or the underlying array will affect all other references to that slice.
buffer = [ a, b, c, d, e, f, g, h ] <pre class="brush:php;toolbar:false">func publish(input []byte, output chan <p>If you send buffer[:n] to a channel, both the publish() function and any consumer goroutines will be accessing the same memory. During each iteration, the reader.Read(buffer) function reads up to 8 bytes from the input data into this buffer slice. After reading, the publisher sends buffer[:n] to the output channel, where n is the number of bytes read in the current iteration. </p> <p>The problem here is that <strong>buffer</strong> <strong>is reused across iterations</strong>. Every time reader.Read() is called, <strong>it overwrites the data stored in buffer.</strong> </p>
- Iteration 1: publish() function reads the first 8 bytes into buffer and sends buffer[:n] (say, [a, b, c, d, e, f, g, h]) to the channel.
- Iteration 2: The publish() function overwrites the buffer with the next 8 bytes, let’s say [i, j, k, l, m, n, o, p], and sends buffer[:n] again.
At this point, if one of the worker goroutines is still processing the first chunk, it is now reading stale or corrupted data because the buffer has been overwritten by the second chunk. Reusing a slice neans sharing the same memory.
How to fix the Data Race?
To avoid the race condition, we must ensure that each chunk of data sent to the channel has its own independent memory. This can be achieved by creating a new slice for each chunk and copying the data from the buffer to this new slice. The key fix is to copy the contents of the buffer into a new slice before sending it to the chunkChannel:
chunk := make([]byte, n) // Step 1: Create a new slice with its own memory copy(chunk, buffer[:n]) // Step 2: Copy data from buffer to the new slice output <p>Why this fix works? By creating a new slice (chunk) for each iteration, you ensure that each chunk has its own memory. This prevents the consumers from reading from the buffer that the publisher is still modifying. copy() function copies the contents of the buffer into the newly allocated slice (chunk). This decouples the memory used by each chunk from the buffer. Now, when the publisher reads new data into the buffer, it doesn’t affect the chunks that have already been sent to the channel.<br> </p> <pre class="brush:php;toolbar:false">+-------------------------+ +------------------------+ | Publisher (New Memory) | | Consumers (Read Copy) | | [ a, b, c ] --> chunk1 | | Reading: chunk1 | | [ d, e, f ] --> chunk2 | | Reading: chunk2 | +-------------------------+ +------------------------+ ↑ ↑ (1) (2) Publisher Creates New Chunk Consumers Read Safely
This solution works is that it breaks the connection between the publisher and the consumers by eliminating shared memory. Each consumer now works on its own copy of the data, which the publisher does not modify. Here’s how the modified publish() function looks:
func publish(input []byte, output chan <h3> Summary </h3> <p><strong>Slices Are Reference Types:</strong><br> As mentioned earlier, Go slices are reference types, meaning they point to an underlying array. When you pass a slice to a channel or a function, you’re passing a reference to that array, not the data itself. This is why reusing a slice leads to a data race: multiple goroutines end up referencing and modifying the same memory.</p><p><strong>Memory Allocation:</strong><br> When we create a new slice with make([]byte, n), Go allocates a separate block of memory for that slice. This means the new slice (chunk) has its own backing array, independent of the buffer. By copying the data from buffer[:n] into chunk, we ensure that each chunk has its own private memory space.</p> <p><strong>Decoupling Memory:</strong><br> By decoupling the memory of each chunk from the buffer, the publisher can continue to read new data into the buffer without affecting the chunks that have already been sent to the channel. Each chunk now has its own independent copy of the data, so the consumers can process the chunks without interference from the publisher.</p> <p><strong>Preventing Data Races:</strong><br> The main source of the data race was the concurrent access to the shared buffer. By creating new slices and copying the data, we eliminate the shared memory, and each goroutine operates on its own data. This removes the possibility of a race condition because there’s no longer any contention over the same memory.</p> <h3> Conclusion </h3> <p>The core of the fix is simple but powerful: by ensuring that each chunk of data has its own memory, we eliminate the shared resource (the buffer) that was causing the data race. This is achieved by copying the data from the buffer into a new slice before sending it to the channel. With this approach, each consumer works on its own copy of the data, independent of the publisher’s actions, ensuring safe concurrent processing without race conditions. This method of decoupling shared memory is a fundamental strategy in concurrent programming. It prevents the unpredictable behavior caused by race conditions and ensures that your Go programs remain safe, predictable, and correct, even when multiple goroutines are accessing data concurrently.</p> <p>It's that easy!</p>
The above is the detailed content of Tricky Golang interview questions - Part Data Race. For more information, please follow other related articles on the PHP Chinese website!

Golang is more suitable for high concurrency tasks, while Python has more advantages in flexibility. 1.Golang efficiently handles concurrency through goroutine and channel. 2. Python relies on threading and asyncio, which is affected by GIL, but provides multiple concurrency methods. The choice should be based on specific needs.

The performance differences between Golang and C are mainly reflected in memory management, compilation optimization and runtime efficiency. 1) Golang's garbage collection mechanism is convenient but may affect performance, 2) C's manual memory management and compiler optimization are more efficient in recursive computing.

ChooseGolangforhighperformanceandconcurrency,idealforbackendservicesandnetworkprogramming;selectPythonforrapiddevelopment,datascience,andmachinelearningduetoitsversatilityandextensivelibraries.

Golang and Python each have their own advantages: Golang is suitable for high performance and concurrent programming, while Python is suitable for data science and web development. Golang is known for its concurrency model and efficient performance, while Python is known for its concise syntax and rich library ecosystem.

In what aspects are Golang and Python easier to use and have a smoother learning curve? Golang is more suitable for high concurrency and high performance needs, and the learning curve is relatively gentle for developers with C language background. Python is more suitable for data science and rapid prototyping, and the learning curve is very smooth for beginners.

Golang and C each have their own advantages in performance competitions: 1) Golang is suitable for high concurrency and rapid development, and 2) C provides higher performance and fine-grained control. The selection should be based on project requirements and team technology stack.

Golang is suitable for rapid development and concurrent programming, while C is more suitable for projects that require extreme performance and underlying control. 1) Golang's concurrency model simplifies concurrency programming through goroutine and channel. 2) C's template programming provides generic code and performance optimization. 3) Golang's garbage collection is convenient but may affect performance. C's memory management is complex but the control is fine.

Goimpactsdevelopmentpositivelythroughspeed,efficiency,andsimplicity.1)Speed:Gocompilesquicklyandrunsefficiently,idealforlargeprojects.2)Efficiency:Itscomprehensivestandardlibraryreducesexternaldependencies,enhancingdevelopmentefficiency.3)Simplicity:


Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

AI Hentai Generator
Generate AI Hentai for free.

Hot Article

Hot Tools

EditPlus Chinese cracked version
Small size, syntax highlighting, does not support code prompt function

WebStorm Mac version
Useful JavaScript development tools

Safe Exam Browser
Safe Exam Browser is a secure browser environment for taking online exams securely. This software turns any computer into a secure workstation. It controls access to any utility and prevents students from using unauthorized resources.

SublimeText3 English version
Recommended: Win version, supports code prompts!

Zend Studio 13.0.1
Powerful PHP integrated development environment