Concurrency has become an essential feature in modern programming languages. Most programming languages now have some method to achieve concurrency.
Some of these implementations are very powerful and can spread the load across different system threads, as in Java; others simulate this behavior on a single thread, as in Ruby.
Go's concurrency model is very powerful. It is based on CSP (Communicating Sequential Processes), which breaks a problem into smaller sequential processes and then schedules instances of these processes (called goroutines). These processes communicate by passing information through channels.
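As a minimal sketch of this idea (the function names square and squareAll are made up for illustration and are not part of the article's repository), one goroutine can act as a small sequential process that receives numbers on one channel and sends results on another:

```go
package main

import "fmt"

// square is a small sequential process: it reads numbers from in,
// squares each one, and sends the result on out.
func square(in <-chan int, out chan<- int) {
	for n := range in {
		out <- n * n
	}
	close(out) // no more results once the input is exhausted
}

// squareAll runs square in its own goroutine and feeds it nums.
func squareAll(nums []int) []int {
	in := make(chan int)
	out := make(chan int)
	go square(in, out)

	// Send the input from another goroutine so that sending and
	// receiving can overlap on the unbuffered channels.
	go func() {
		for _, n := range nums {
			in <- n
		}
		close(in)
	}()

	var results []int
	for r := range out {
		results = append(results, r)
	}
	return results
}

func main() {
	fmt.Println(squareAll([]int{1, 2, 3})) // prints [1 4 9]
}
```

The two goroutines never share memory directly; all data moves through the channels, which is the core of the CSP style.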
In this article, we will explore how to take advantage of Go's concurrency and how to use it in a worker pool. In the second article in the series, we will explore how to build a powerful concurrency solution.
A simple example
Suppose we need to call an external API, and each call takes 100ms. If we make this call 1000 times synchronously, it will take 100s.

    //// model/data.go
    package model

    // SimpleData is the unit of work; ID identifies the task.
    type SimpleData struct {
        ID int
    }

    //// basic/basic.go
    package basic

    import (
        "fmt"
        "time"

        "github.com/Joker666/goworkerpool/model"
    )

    // Work processes every item one after another and prints the total time.
    func Work(allData []model.SimpleData) {
        start := time.Now()
        for i := range allData {
            Process(allData[i])
        }
        elapsed := time.Since(start)
        fmt.Printf("Took ===============> %s\n", elapsed)
    }

    // Process simulates an external API call that takes 100ms.
    func Process(data model.SimpleData) {
        fmt.Printf("Start processing %d\n", data.ID)
        time.Sleep(100 * time.Millisecond)
        fmt.Printf("Finish processing %d\n", data.ID)
    }

    //// main.go
    package main

    import (
        "fmt"

        "github.com/Joker666/goworkerpool/basic"
        "github.com/Joker666/goworkerpool/model"
    )

    func main() {
        // Prepare the data
        var allData []model.SimpleData
        for i := 0; i < 1000; i++ {
            data := model.SimpleData{ID: i}
            allData = append(allData, data)
        }
        fmt.Printf("Start processing all work \n")

        // Process
        basic.Work(allData)
    }

Output:

    Start processing all work
    Took ===============> 1m40.226679665s

The above code creates the model package, which contains a struct with a single field of type int. We process the data synchronously, which is obviously not optimal since these tasks can be processed concurrently. Let's change the solution and use goroutines and a channel to handle it.
Asynchronous
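
    //// worker/notPooled.go
    package worker

    import (
        "fmt"
        "sync"
        "time"

        "github.com/Joker666/goworkerpool/basic"
        "github.com/Joker666/goworkerpool/model"
    )

    func NotPooledWork(allData []model.SimpleData) {
        start := time.Now()
        var wg sync.WaitGroup
        dataCh := make(chan model.SimpleData, 100)

        // Dispatcher: start one new goroutine for every item read from the channel.
        wg.Add(1)
        go func() {
            defer wg.Done()
            for data := range dataCh {
                wg.Add(1)
                go func(data model.SimpleData) {
                    defer wg.Done()
                    basic.Process(data)
                }(data)
            }
        }()

        // Producer: push all items, then close the channel to end the dispatcher loop.
        for i := range allData {
            dataCh <- allData[i]
        }
        close(dataCh)

        wg.Wait()
        elapsed := time.Since(start)
        fmt.Printf("Took ===============> %s\n", elapsed)
    }

    //// main.go
    // Process (main.go must also import "github.com/Joker666/goworkerpool/worker")
    worker.NotPooledWork(allData)
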
Output:

    Start processing all work
    Took ===============> 101.191534ms

In the above code, we create a buffered channel with a capacity of 100 and push the data into it in NotPooledWork(). Once the channel holds 100 elements, a send blocks until an element is read. A for range loop reads from the channel and starts a goroutine for each element. There is no limit on the number of goroutines, so as many tasks as possible run at the same time. In theory, any amount of data can be processed this way, given the required resources. Running the code, it took only about 100ms to complete 1000 tasks. Crazy! Not entirely; read on.
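The blocking behavior of a full buffered channel can be observed with a small sketch. The helper trySend is hypothetical and uses select with a default case so that the demo reports a full buffer instead of blocking; a capacity of 2 is used only to keep the example short.

```go
package main

import "fmt"

// trySend attempts a non-blocking send and reports whether it succeeded.
func trySend(ch chan int, v int) bool {
	select {
	case ch <- v:
		return true
	default:
		// The buffer is full and no receiver is waiting.
		return false
	}
}

func main() {
	ch := make(chan int, 2)

	fmt.Println(trySend(ch, 1))   // true
	fmt.Println(trySend(ch, 2))   // true
	fmt.Println(trySend(ch, 3))   // false: the buffer is full
	fmt.Println(len(ch), cap(ch)) // 2 2

	<-ch                        // reading one element frees a slot
	fmt.Println(trySend(ch, 3)) // true
}
```

A plain send (dataCh <- v) in the same situation would simply wait until a receiver takes an element, which is what throttles the producer loop in NotPooledWork().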
The problem
Unless we own all the resources on earth, there is a limit to how much we can allocate at a given time. A goroutine starts with a stack of about 2 KB, but the stack can grow as far as 1 GB. Executing every task concurrently, as the solution above does, would quickly exhaust the machine's memory and CPU with, say, a million tasks. We either need to upgrade the machine or find a better solution.
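Computer scientists considered this problem long ago and came up with an excellent solution: the thread pool, or worker pool. The idea is to process tasks with a pool that has a limited number of workers. Each worker processes tasks one after another, which keeps CPU and memory usage from growing out of control.
Solution: Worker Pool
Let's fix the problem we ran into by implementing a worker pool.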
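
    //// worker/pooled.go
    package worker

    import (
        "fmt"
        "sync"
        "time"

        "github.com/Joker666/goworkerpool/basic"
        "github.com/Joker666/goworkerpool/model"
    )

    func PooledWork(allData []model.SimpleData) {
        start := time.Now()
        var wg sync.WaitGroup
        workerPoolSize := 100
        dataCh := make(chan model.SimpleData, workerPoolSize)

        // Start a fixed number of workers that all read from the same channel.
        for i := 0; i < workerPoolSize; i++ {
            wg.Add(1)
            go func() {
                defer wg.Done()
                for data := range dataCh {
                    basic.Process(data)
                }
            }()
        }

        // Producer: push all items, then close the channel so the workers exit.
        for i := range allData {
            dataCh <- allData[i]
        }
        close(dataCh)

        wg.Wait()
        elapsed := time.Since(start)
        fmt.Printf("Took ===============> %s\n", elapsed)
    }

    //// main.go
    // Process
    worker.PooledWork(allData)
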
Output:

    Start processing all work
    Took ===============> 1.002972449s

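In the code above, the number of workers is limited to 100, and we create that many goroutines to process the tasks. We can think of the channel as a queue and of the worker goroutines as consumers. Multiple goroutines can listen on the same channel, but each element in the channel is processed only once.
Go channels can be used as queues.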
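The "processed only once" property can be checked with a short sketch. The function consumeAll is a hypothetical helper, not part of the article's repository: several workers read from one channel, and a mutex-protected map counts how many times each job ID was received.

```go
package main

import (
	"fmt"
	"sync"
)

// consumeAll starts the given number of workers on a single channel and
// returns how many times each job ID was received.
func consumeAll(jobs, workers int) map[int]int {
	ch := make(chan int, workers)
	seen := make(map[int]int)
	var mu sync.Mutex // protects seen, which all workers write to
	var wg sync.WaitGroup

	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for id := range ch {
				mu.Lock()
				seen[id]++
				mu.Unlock()
			}
		}()
	}

	for id := 0; id < jobs; id++ {
		ch <- id
	}
	close(ch)
	wg.Wait()
	return seen
}

func main() {
	seen := consumeAll(1000, 100)
	duplicates := 0
	for _, n := range seen {
		if n != 1 {
			duplicates++
		}
	}
	fmt.Println(len(seen), duplicates) // prints 1000 0
}
```

Which worker receives a given ID is not deterministic, but every ID is delivered to exactly one of them.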
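This is a better solution. Running the code, we see that all tasks complete in about 1s: 1000 tasks shared among 100 workers means each worker handles roughly 10 tasks of 100ms each. That is not as fast as 100ms, but it is good enough for our needs, and we get a solution that spreads the load over time.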
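The timings measured in this article follow from simple arithmetic. The sketch below uses a hypothetical helper, estimate, that assumes every task takes the same time and ignores scheduling overhead, so it gives only a rough lower bound:

```go
package main

import (
	"fmt"
	"time"
)

// estimate returns a rough lower bound on the wall-clock time needed to
// run the given number of equal-length tasks on a fixed number of workers.
func estimate(tasks, workers int, perTask time.Duration) time.Duration {
	rounds := (tasks + workers - 1) / workers // ceiling division
	return time.Duration(rounds) * perTask
}

func main() {
	perTask := 100 * time.Millisecond
	fmt.Println(estimate(1000, 1, perTask))    // 1m40s: the synchronous version
	fmt.Println(estimate(1000, 100, perTask))  // 1s: the pool of 100 workers
	fmt.Println(estimate(1000, 1000, perTask)) // 100ms: one goroutine per task
}
```

Choosing the pool size is therefore a trade-off between total running time and the resources in use at any one moment.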
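Handling errors
We are not done yet. The code above looks like a complete solution, but it is not: we have not handled errors. So we need to simulate a failure and see how to deal with it.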

    //// worker/pooledError.go
    package worker

    import (
        "fmt"
        "sync"
        "time"

        "github.com/Joker666/goworkerpool/model"
    )

    func PooledWorkError(allData []model.SimpleData) {
        start := time.Now()
        var wg sync.WaitGroup
        workerPoolSize := 100
        dataCh := make(chan model.SimpleData, workerPoolSize)
        // Buffered so that workers never block while reporting an error.
        errors := make(chan error, 1000)

        for i := 0; i < workerPoolSize; i++ {
            wg.Add(1)
            go func() {
                defer wg.Done()
                for data := range dataCh {
                    process(data, errors)
                }
            }()
        }

        for i := range allData {
            dataCh <- allData[i]
        }
        close(dataCh)

        // Error reader: print errors until none has arrived for one second.
        wg.Add(1)
        go func() {
            defer wg.Done()
            for {
                select {
                case err := <-errors:
                    fmt.Println("finished with error:", err.Error())
                case <-time.After(time.Second * 1):
                    fmt.Println("Timeout: errors finished")
                    return
                }
            }
        }()

        defer close(errors)
        wg.Wait()
        elapsed := time.Since(start)
        fmt.Printf("Took ===============> %s\n", elapsed)
    }

    // process simulates a failure for every ID divisible by 29.
    func process(data model.SimpleData, errors chan<- error) {
        fmt.Printf("Start processing %d\n", data.ID)
        time.Sleep(100 * time.Millisecond)
        if data.ID%29 == 0 {
            errors <- fmt.Errorf("error on job %v", data.ID)
        } else {
            fmt.Printf("Finish processing %d\n", data.ID)
        }
    }

    //// main.go
    // Process
    worker.PooledWorkError(allData)

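We modified the process() function to simulate some failures (here, every task whose ID is divisible by 29) and push each error into the errors channel. So, to handle errors that occur during concurrent execution, we can use an errors channel to hold the error data. After all tasks have finished, we can check whether the errors channel contains anything. Each element in the errors channel carries the task ID, so the failed tasks can be reprocessed when needed.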
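One possible way to check the errors only after all tasks have finished is to wait for the workers, close the channel, and then range over it. The sketch below is a variation on the article's code, not the author's version: runJobs is a hypothetical helper that collects failed job IDs instead of error values, skips the 100ms sleep, and reuses the same ID % 29 failure rule as process().

```go
package main

import (
	"fmt"
	"sync"
)

// runJobs processes IDs 0..jobs-1 on a fixed number of workers and
// returns the IDs of the jobs that failed.
func runJobs(jobs, workers int) []int {
	ids := make(chan int, workers)
	failed := make(chan int, jobs) // large enough that workers never block on it
	var wg sync.WaitGroup

	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for id := range ids {
				if id%29 == 0 { // simulated failure
					failed <- id
				}
			}
		}()
	}

	for id := 0; id < jobs; id++ {
		ids <- id
	}
	close(ids)

	wg.Wait()     // every worker has returned, so no more sends can happen
	close(failed) // closing lets the range loop below terminate

	var result []int
	for id := range failed {
		result = append(result, id)
	}
	return result
}

func main() {
	failed := runJobs(1000, 100)
	fmt.Println(len(failed)) // prints 35
}
```

Because the channel is drained only after wg.Wait() returns, no timeout is needed to decide when the errors are finished. The cost is that the buffer must be large enough to hold every possible error.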
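Compared with not handling errors at all, this is clearly a better solution. But we can do even better. In the next article, we will discuss how to write a powerful worker pool package that handles concurrent tasks with a limited number of workers.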
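Summary
Go's concurrency model is powerful enough that building a worker pool solves the problem well without much work, which is why a worker pool is not included in the standard library. We can, however, build one that fits our own needs. I will cover that in the next article soon, so stay tuned!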