
Concurrent mapping with slices in golang

WBOY
2024-02-11 09:57:09


PHP editor Banana presents an article on concurrent maps with slices in golang. It looks at how to keep a map of slices correct when several goroutines update it at the same time. Protecting the map alone is not enough: a slice stored as a map value is not safe for concurrent appends, so access to each slice needs its own synchronization.

Question content

I've been trying to solve a concurrency issue since one of the developers in this area left a few months ago, but I can't find a proper way to solve it.

For context, we load the customer data into a structure like this:

[ key ] -> { value }

[Customer Specific Hash] -> {Data Point/File Slice}

Example - really bad formatting, sorry:

[a60d849ad97bfb833e1096941] 
-> 
{ 
 { StartDate: '01-02-2022', EndDate: '28-02-2022', DataFrames: [1598,921578,12981,21749,192578...]},
 { StartDate: '01-03-2022', EndDate: '28-03-2022', DataFrames: [1234,1567,6781,126978...]},
}

The reason for this structure is that we have 100,000 customers, and every night we start a process that consolidates the data by each customer's hash (effectively a bucket). Before processing the dataframes, we iterate over the slice and "merge" the dataframes into one large dataframe, which has many legal/accounting rules.

It runs in a goroutine to index all data points as quickly as possible.

So the implementation is essentially a sync.Map[string, []DataFrame]. However, I noticed that while operations on the map are protected, appending to the DataFrame slice is not. Each hash probably gets around 20-30 file references in that slice per night.

There's a good chance that customer data has been merged incorrectly over the past two years, and I've been tasked with fixing it. Before sync.Map, they used an RWMutex with a map, again without protecting the slice, pointing to this article as a guide.

First of all, is the idea of a Map containing slices an appropriate data structure?

I'm trying to create an RWMutex-based slice handler, but I wondered whether the Map could instead hold a chan DataFrame that gets written to while indexing the customer files and is then, once indexing completes, merged in a second step into an array (since len(chanx) will be known)?
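The channel idea in the paragraph above could be sketched roughly as follows. This is a variation, not the exact design described: it uses one shared channel and a single collector goroutine rather than a channel per map entry. The names `indexed` and `Collect`, and the fields of `DataFrame`, are hypothetical stand-ins that do not come from the original code.

```go
package main

import (
	"fmt"
	"sync"
)

// DataFrame is a hypothetical stand-in for the real data-point type.
type DataFrame struct{ Frames []int }

// indexed pairs a data frame with the customer hash it belongs to.
type indexed struct {
	Key string
	DF  DataFrame
}

// Collect drains ch until it is closed and groups the frames by key.
// Only the calling goroutine touches the map and its slices,
// so no locks are needed here.
func Collect(ch <-chan indexed) map[string][]DataFrame {
	out := make(map[string][]DataFrame)
	for it := range ch {
		out[it.Key] = append(out[it.Key], it.DF)
	}
	return out
}

func main() {
	ch := make(chan indexed)
	var wg sync.WaitGroup
	for i := 0; i < 25; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			// Each indexing goroutine only sends; it never appends itself.
			ch <- indexed{Key: "a60d849ad97bfb833e1096941", DF: DataFrame{Frames: []int{i}}}
		}(i)
	}
	// Close the channel once every producer has finished sending.
	go func() { wg.Wait(); close(ch) }()

	grouped := Collect(ch)
	fmt.Println(len(grouped["a60d849ad97bfb833e1096941"]))
}
```

Because every append happens in the collector, the per-customer slices need no mutex; the trade-off is that all writes are funneled through one goroutine.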

I mainly come from a Java background, so I may be mixing up some terminology; apologies for that.

Solution

You have two different problems:

  1. Concurrency issues when updating the map
  2. Concurrency issues when updating an individual map entry

sync.Map prevents the first, but not the second.

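One way to solve this is to store a pointer to a lock-guarded struct as the map value (sync.Map is not generic, so the type parameters here are only notation):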

sync.Map[string, *DFrame]

where

type DFrame struct {
  sync.RWMutex 
  Data []DataFrame
}

Once you get the entry from the map, Lock or RLock it before touching the data. This applies to more than appending to the slice: even if you only read the dataframes, you must RLock the struct.
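A minimal sketch of the read side, assuming a hypothetical `Snapshot` helper and a simplified `DataFrame` type (neither is in the original code). It copies the slice under the read lock so the caller can iterate without holding it:

```go
package main

import (
	"fmt"
	"sync"
)

// DataFrame is a hypothetical stand-in for the real data-point type.
type DataFrame struct{ Frames []int }

// DFrame guards one customer's slice with its own lock.
type DFrame struct {
	sync.RWMutex
	Data []DataFrame
}

// Snapshot returns a copy of the slice taken under the read lock.
// The copy is shallow: the Frames slices inside each element are
// still shared with the original.
func (d *DFrame) Snapshot() []DataFrame {
	d.RLock()
	defer d.RUnlock()
	out := make([]DataFrame, len(d.Data))
	copy(out, d.Data)
	return out
}

func main() {
	d := &DFrame{Data: []DataFrame{{Frames: []int{1598, 921578}}}}
	snap := d.Snapshot()

	// A later append under the write lock does not change the snapshot.
	d.Lock()
	d.Data = append(d.Data, DataFrame{Frames: []int{1234}})
	d.Unlock()

	fmt.Println(len(snap), len(d.Data)) // 1 2
}
```

Whether to copy or to hold the read lock for the whole merge is a design choice; copying keeps the lock hold time short at the cost of an allocation.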

So if you want to append a new dataframe:

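// LoadOrStore returns the existing entry for key, or stores df and returns it.
df := &DFrame{}
entry, _ := m.LoadOrStore(key, df)
dfEntry := entry.(*DFrame)
// Hold the entry's write lock only while appending.
dfEntry.Lock()
dfEntry.Data = append(dfEntry.Data, newDataFrame)
dfEntry.Unlock()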


Statement:
This article is reproduced from stackoverflow.com. In case of infringement, please contact admin@php.cn for removal.