Why are logical reads so high when using windowed aggregate functions, especially with common subexpression spools?

Linda Hamilton
2024-12-26 18:47:15

Why are logical reads for windowed aggregate functions so high?

Queries that use windowed aggregate functions can report surprisingly high logical reads for the worktable when the execution plan contains common subexpression spools, particularly on large tables. This article explains where those numbers come from and how to interpret logical read counts for worktables.

Explanation

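Logical reads are counted differently for worktables than for conventional spool tables. For a worktable, each row read is reported as one "logical read," whereas a "real" spool table reports hashed pages. The two counters are therefore in different units (rows versus pages), and a large worktable figure cannot be compared directly with page-based reads on an ordinary table.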

The rationale for counting reads this way is that it gives more meaningful information for analysis. Counting hashed pages for a worktable would be less useful because these structures are internal to the server. Reporting the number of rows spooled better reflects how much tempdb work the query actually does.

Formula Derivation

The formula derived for predicting worktable logical reads is:

Worktable logical reads = 1 + (NumberOfRows * 2) + (NumberOfGroups * 4)

This formula accounts for the following:

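  • 1: A constant term. It most plausibly corresponds to the one extra row the primary spool emits after the last group so that the final group is processed (see the dummy row described below).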
  • NumberOfRows * 2: Each of the two secondary spools (created for reducing the cost of returning rows) reads the worktable in full, so every row is read twice.
  • NumberOfGroups * 4: A per-group term. The primary spool emits one row per distinct group value (plus one more at the end), as explained below, and the formula attributes four reads to each group.
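
As a quick check of the arithmetic, the formula can be wrapped in a small Python helper. This is only a sketch: the function name `predicted_worktable_reads` and the sample figures (1,000 rows in 10 groups) are made up for illustration and do not come from a measured query.

```python
def predicted_worktable_reads(number_of_rows: int, number_of_groups: int) -> int:
    """Apply the article's formula: 1 + (rows * 2) + (groups * 4)."""
    if number_of_rows < 0 or number_of_groups < 0:
        raise ValueError("row and group counts must be non-negative")
    return 1 + (number_of_rows * 2) + (number_of_groups * 4)


# Hypothetical input: 1,000 rows spread over 10 partitions.
print(predicted_worktable_reads(1000, 10))  # 2041
```

Because the row term dominates, the predicted figure grows roughly as twice the table's row count, which is why the number looks so large on big tables.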

Primary Spool Row Emission

The primary spool, tasked with accumulating rows and performing the aggregate calculation, operates as follows:

  • Reads each row from the input and writes it to the worktable.
  • When a new group is encountered, it emits a row to the nested loops operator, indicating the start of a new group partition.
  • Averages for each group are computed using the rows in the worktable.
  • Averages are joined with the rows in the worktable.
  • The worktable is truncated to prepare for the next group.
  • To process the final group, the spool emits a dummy row.
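
The steps above can be modelled in a few lines of Python. This is a toy simulation, not the engine's actual implementation: the names `simulate_primary_spool` and `flush`, the `(group_key, value)` row layout, and the assumption that input arrives already sorted by group key are all illustrative. It also assumes that the first row counts as the start of a new group, so the first emission happens while the worktable is still empty. Under those assumptions the spool emits one row per group plus one final dummy row.

```python
from typing import Iterable, List, Tuple

Row = Tuple[str, float]
JoinedRow = Tuple[str, float, float]


def flush(worktable: List[Row], output: List[JoinedRow]) -> None:
    """Average the buffered group, join the average back to its rows, truncate."""
    if worktable:
        average = sum(value for _, value in worktable) / len(worktable)
        output.extend((key, value, average) for key, value in worktable)
    worktable.clear()


def simulate_primary_spool(rows: Iterable[Row]) -> Tuple[List[JoinedRow], int]:
    """Toy model of the primary spool; rows must arrive sorted by group key.

    Returns the joined output and the number of rows the spool emitted
    to the (simulated) nested loops operator.
    """
    worktable: List[Row] = []
    output: List[JoinedRow] = []
    emitted = 0
    current_key = object()  # sentinel that never equals a real key

    for key, value in rows:
        if key != current_key:
            # New group: emit one row, which triggers processing of whatever
            # the worktable currently holds, then start buffering afresh.
            emitted += 1
            flush(worktable, output)
            current_key = key
        worktable.append((key, value))

    # End of input: a dummy row is emitted so the final group is processed.
    emitted += 1
    flush(worktable, output)
    return output, emitted


sample = [("A", 10.0), ("A", 20.0), ("B", 5.0), ("C", 1.0), ("C", 3.0)]
joined, emitted_rows = simulate_primary_spool(sample)
print(emitted_rows)  # 4 = 3 groups + 1 dummy row
print(joined[0])     # ('A', 10.0, 15.0)
```

The emitted-row count depends only on the number of distinct groups, not on how many rows each group holds, which matches the "distinct group values (plus 1)" description.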

Additional Considerations

A test script that replicates the same process by hand can report far fewer logical reads (11 in the case that prompted this question). This discrepancy is attributed to optimizations the query processor applies in different environments. The formula remains valid in the general cases where nested loops or hash joins are used.

Conclusion

Understanding the counting differences for logical reads in worktables is essential for accurately interpreting execution plans involving windowed aggregate functions. The formula provided offers a useful way to estimate worktable logical reads, aiding in performance analysis and optimization efforts.
