


I am trying to query about 10,000 rows from a table. After trying various other options involving limit
offset
and not finding the desired success, I tried querying each goroutine for a single row. The idea is that each row only takes ~5ms
to query and fetch, but a batch of 10k will take over 20s
.
Shown below is a simplified version of the code:
func queryEmp(IDs[]string, spannerClient *spanner.Client) (Employee,error){ query := "Select name from Employee Where id = @id" g, gCtx := errgroup.WithContext(ctx) for _, ID := range IDs { id := ID g.Go(func() error { tx := spannerClient.Single() defer tx.Close() stmt2 := spanner.NewStatement(query) stmt2.Params = map[string]interface{}{ "ID": id, } qstart := time.Now() it := tx.Query(gCtx, stmt2) defer it.Stop() logrus.Debugf("%s took %v \n", "query execution.", time.Since(qstart)) for { row, err := it.Next() if err == iterator.Done { break } if err != nil { return err } var eID string if err := row.Column(0, &eID); err != nil { return err } } return nil }) } err = g.Wait() }
The results are tracked as follows:
{"message":"query execution. took 39.677µs \n","severity":"debug","time":"2023-11-03T20:51:29-04:00"} {"message":"query execution. took 34.125µs \n","severity":"debug","time":"2023-11-03T20:51:29-04:00"} {"message":"query execution. took 26.634µs \n","severity":"debug","time":"2023-11-03T20:51:29-04:00"} {"message":"query execution. took 29.303µs \n","severity":"debug","time":"2023-11-03T20:51:29-04:00"} ... ... ... {"message":"query execution. took 188.749562ms \n","severity":"debug","time":"2023-11-03T20:51:29-04:00"} {"message":"query execution. took 276.424692ms \n","severity":"debug","time":"2023-11-03T20:51:29-04:00"} {"message":"query execution. took 188.62849ms \n","severity":"debug","time":"2023-11-03T20:51:29-04:00"} {"message":"query execution. took 217.067524ms \n","severity":"debug","time":"2023-11-03T20:51:29-04:00"} {"message":"query execution. took 276.949166ms \n","severity":"debug","time":"2023-11-03T20:51:29-04:00"} ... ... ... {"message":"query execution. took 454.64281ms \n","severity":"debug","time":"2023-11-03T20:51:29-04:00"} {"message":"query execution. took 452.0848ms \n","severity":"debug","time":"2023-11-03T20:51:29-04:00"} {"message":"query execution. took 525.748738ms \n","severity":"debug","time":"2023-11-03T20:51:29-04:00"} {"message":"query execution. took 454.704656ms \n","severity":"debug","time":"2023-11-03T20:51:29-04:00"} {"message":"query execution. took 455.4276ms \n","severity":"debug","time":"2023-11-03T20:51:29-04:00"} ... ... ... {"message":"query execution. took 6.767574136s \n","severity":"debug","time":"2023-11-03T20:52:00-04:00"} {"message":"query execution. took 6.780578444s \n","severity":"debug","time":"2023-11-03T20:52:00-04:00"} {"message":"query execution. took 6.785085491s \n","severity":"debug","time":"2023-11-03T20:52:00-04:00"} {"message":"query execution. took 6.779527006s \n","severity":"debug","time":"2023-11-03T20:52:00-04:00"}
It starts fine and as expected, but the query time keeps increasing.
MaxSessions
and MinSessions
for spannerClient
are 100
so one would imagine it might see a slight slowdown after 100, but it is not the truth.
Please read here:
<code> Sessions can execute only one transaction at a time. Standalone reads, writes, and queries use a transaction internally, and count toward the one transaction limit. </code>
Non-iterative queries (ReadRow
etc.) give me the same results.
Using tx := spannerClient.Single()
outside a for loop will give similar results.
question:
- Does this mean that despite executing
spannerClient.Single()
in the goroutine, the goroutine is still trying to use the same session/transaction? - How to modify the above content to solve this problem?
Correct Answer
TLDR: The default maximum session pool size is 400, which means that no more than 400 queries can ever be run in parallel. You need to increase the session pool size to achieve this concurrency.
First of all: I don't think sending 10,000 queries in parallel so that each query reads one row is not the most efficient solution to your problem. If you have no other criteria than filtering by employee IDs, and those IDs are scattered all over the place, it would still be more efficient to create the query in the form
select * from employees where id in unnest(@ids)
For a complete example, see this comment: https:// /github.com/googleapis/google-cloud-go/issues/858#issuecomment-550982307
Back to your specific question:
- You're not actually measuring the time it takes to execute the query. This is a little confusing, but the line
it := tx.Query(gCtx, stmt2)
does not execute the query, it just prepares it to be executed. Executed whenrow, err := it.Next()
is called for the first time. You can also see this in the recorded execution times. The first statement seems to execute within 30 microseconds, which is impossible. - This means that something on your client is limiting your progress, in this case I'm pretty sure it's the maximum size of your session pool. The default maximum session pool size is 400. This means that up to 400 queries can be run in parallel. The increasing wait time you see is because the goroutine is placed in a wait queue waiting for the session to become available. Goroutines at the end of the queue will wait longer.
The above is the detailed content of Why does Spanner query using ReadOnlyTransaction in golang goroutine gradually become slower?. For more information, please follow other related articles on the PHP Chinese website!

The article explains how to use the pprof tool for analyzing Go performance, including enabling profiling, collecting data, and identifying common bottlenecks like CPU and memory issues.Character count: 159

The article discusses writing unit tests in Go, covering best practices, mocking techniques, and tools for efficient test management.

This article demonstrates creating mocks and stubs in Go for unit testing. It emphasizes using interfaces, provides examples of mock implementations, and discusses best practices like keeping mocks focused and using assertion libraries. The articl

This article explores Go's custom type constraints for generics. It details how interfaces define minimum type requirements for generic functions, improving type safety and code reusability. The article also discusses limitations and best practices

The article discusses Go's reflect package, used for runtime manipulation of code, beneficial for serialization, generic programming, and more. It warns of performance costs like slower execution and higher memory use, advising judicious use and best

This article explores using tracing tools to analyze Go application execution flow. It discusses manual and automatic instrumentation techniques, comparing tools like Jaeger, Zipkin, and OpenTelemetry, and highlighting effective data visualization

The article discusses using table-driven tests in Go, a method that uses a table of test cases to test functions with multiple inputs and outcomes. It highlights benefits like improved readability, reduced duplication, scalability, consistency, and a

The article discusses managing Go module dependencies via go.mod, covering specification, updates, and conflict resolution. It emphasizes best practices like semantic versioning and regular updates.


Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

AI Hentai Generator
Generate AI Hentai for free.

Hot Article

Hot Tools

ZendStudio 13.5.1 Mac
Powerful PHP integrated development environment

mPDF
mPDF is a PHP library that can generate PDF files from UTF-8 encoded HTML. The original author, Ian Back, wrote mPDF to output PDF files "on the fly" from his website and handle different languages. It is slower than original scripts like HTML2FPDF and produces larger files when using Unicode fonts, but supports CSS styles etc. and has a lot of enhancements. Supports almost all languages, including RTL (Arabic and Hebrew) and CJK (Chinese, Japanese and Korean). Supports nested block-level elements (such as P, DIV),

SecLists
SecLists is the ultimate security tester's companion. It is a collection of various types of lists that are frequently used during security assessments, all in one place. SecLists helps make security testing more efficient and productive by conveniently providing all the lists a security tester might need. List types include usernames, passwords, URLs, fuzzing payloads, sensitive data patterns, web shells, and more. The tester can simply pull this repository onto a new test machine and he will have access to every type of list he needs.

WebStorm Mac version
Useful JavaScript development tools

DVWA
Damn Vulnerable Web App (DVWA) is a PHP/MySQL web application that is very vulnerable. Its main goals are to be an aid for security professionals to test their skills and tools in a legal environment, to help web developers better understand the process of securing web applications, and to help teachers/students teach/learn in a classroom environment Web application security. The goal of DVWA is to practice some of the most common web vulnerabilities through a simple and straightforward interface, with varying degrees of difficulty. Please note that this software
