Learning MySQL: a look at the query statement execution process

青灯夜游
2023-01-11 20:38:55

If you want to learn MySQL in depth, you should start with its overall architecture. This article walks through how MySQL executes a query statement. I hope it will be helpful to everyone!

This article is based on MySQL 8.0.18.

Architecture diagram

Parser

The function of the parser is to perform the following work on the SQL statement sent from the client:

  • Syntax analysis: check the syntax of the SQL statement, for example whether brackets and quotation marks are closed
  • Lexical analysis: split the keywords, table names, and field names in the SQL statement into nodes, from which a parse tree is finally built
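
As a rough illustration of these two steps, here is a toy sketch in Python. It is not how MySQL's parser is implemented: the token pattern, the KEYWORDS set, and the function names tokenize and check_brackets are all made up for this example, and it only covers a tiny subset of SQL.

```python
import re

# Hypothetical, tiny token grammar: quoted strings, words, numbers, a few symbols.
TOKEN_RE = re.compile(
    r"\s*(?:(?P<string>'[^']*')"
    r"|(?P<word>[A-Za-z_][A-Za-z0-9_]*)"
    r"|(?P<number>\d+)"
    r"|(?P<symbol>[(),;*=<>.]))"
)
KEYWORDS = {"SELECT", "FROM", "WHERE", "AND", "OR"}


def tokenize(sql):
    """Lexical step: split the statement into (kind, text) tokens."""
    sql = sql.strip()
    tokens, pos = [], 0
    while pos < len(sql):
        match = TOKEN_RE.match(sql, pos)
        if match is None:
            # Reached for characters outside the toy grammar, including an
            # opening quotation mark that is never closed.
            raise SyntaxError(f"unexpected input at position {pos}")
        kind = match.lastgroup
        text = match.group(kind)
        if kind == "word" and text.upper() in KEYWORDS:
            kind = "keyword"
        tokens.append((kind, text))
        pos = match.end()
    return tokens


def check_brackets(tokens):
    """Syntax step (one small part of it): every bracket must be closed."""
    depth = 0
    for kind, text in tokens:
        if kind != "symbol":
            continue
        if text == "(":
            depth += 1
        elif text == ")":
            depth -= 1
            if depth < 0:
                raise SyntaxError("closing bracket without an opening bracket")
    if depth != 0:
        raise SyntaxError("opening bracket is never closed")


tokens = tokenize("SELECT name FROM t1 WHERE (a1 < 10)")
check_brackets(tokens)  # passes silently; drop the ")" and it raises SyntaxError
```

A real parser goes much further and builds a tree from the tokens according to the SQL grammar. The sketch only shows where the "split into nodes" and "are brackets and quotation marks closed" checks fit.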

Preprocessor

The parser mainly checks syntax and vocabulary. However, even if both are correct, the SQL statement still cannot be executed correctly when the table or field it refers to does not exist.

So the role of the preprocessor is semantic analysis: it determines whether the parse tree is semantically correct, including whether the tables and fields exist. After preprocessing, a new parse tree is obtained.
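
The existence check can be pictured with a small Python sketch. Everything here is an assumption made for illustration: the CATALOG dictionary stands in for the server's real metadata, and check_semantics is a hypothetical helper, not a MySQL function.

```python
# Hypothetical schema: table name -> set of column names.
CATALOG = {
    "t1": {"a1", "b1"},
    "t2": {"a2", "b2"},
}


def check_semantics(table, columns, catalog=CATALOG):
    """Return a list of semantic errors for one table reference."""
    if table not in catalog:
        return [f"table {table!r} does not exist"]
    return [
        f"unknown column {column!r} in table {table!r}"
        for column in columns
        if column not in catalog[table]
    ]


print(check_semantics("t1", ["a1"]))        # no errors: an empty list
print(check_semantics("t1", ["a1", "zz"]))  # reports the unknown column zz
print(check_semantics("t3", ["a1"]))        # reports the missing table t3
```

The point is only that this stage needs schema information, which pure syntax checking does not.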

Query optimizer

Query optimizer structure

A SQL statement can be executed in MySQL in several different ways. Although they all produce the same result in the end, their overhead differs. Which one is actually used is decided by the query optimizer. For example:

  • When a table has several usable indexes, which index should be chosen?
  • When a query joins multiple tables, which table should be used as the base table (the driving table)?

The query optimizer is a cost-based optimizer. Working from the parse tree, it estimates the cost of each candidate execution plan and picks the plan with the lowest estimated cost as the final one.

However, the plan with the lowest estimated cost is not necessarily the best one in practice. For example, a full table scan may be performed where an index should have been used. Although "optimizer" is in its name, the query optimizer is not omnipotent, and in many cases it matters more whether the SQL statement itself is written sensibly.

Logical query optimization

Logical query optimization mainly applies relational algebra equivalences to rewrite the SQL statement so that it executes more efficiently.

A few cases give a brief idea of what logical query optimization does.

  • Subquery merging

    Before merging

    SELECT * FROM t1 WHERE a1<10 AND (
      EXISTS(SELECT a2 FROM t2 WHERE t2.a2<5 AND t2.b2=1) OR
      EXISTS(SELECT a2 FROM t2 WHERE t2.a2<5 AND t2.b2=2)
    );

    After merging

    SELECT * FROM t1 WHERE a1<10 AND (
      EXISTS(SELECT a2 FROM t2 WHERE t2.a2<5 AND (t2.b2=1 OR t2.b2=2))
    );

    Multiple subqueries are merged by combining their query conditions, which reduces several scan and join operations to a single table scan and a single join

  • Equivalent predicate rewriting

    For the familiar LIKE fuzzy query, an index range scan is possible only when % is written at the end of the pattern. This is in fact the work of the query optimizer

    Assume the column used in the condition is indexed. Before rewriting

    SELECT * FROM USERINFO WHERE name LIKE 'Abc%';

    After rewriting

    SELECT * FROM USERINFO WHERE name >= 'Abc' AND name < 'Abd';

    This is why the query can be answered with an index range scan

  • Conditional simplification

    Conditional simplification also relies on equivalences and algebraic relationships to simplify conditions

    • Remove redundant brackets in expressions to reduce the depth of the AND and OR trees generated during syntax analysis, e.g. ((a AND b) AND (c AND d)) is simplified to a AND b AND c AND d
    • Constant transfer, e.g. col1 = col2 AND col2 = 3 is simplified to col1 = 3 AND col2 = 3
    • Expression calculation: expressions that can be evaluated directly are replaced by their result, e.g. col1 = 1 + 2 is simplified to col1 = 3
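
To make the equivalent predicate rewriting case concrete, the following Python sketch derives the range bounds from a prefix pattern and checks them against a plain prefix match. It is an illustration under stated assumptions, not MySQL's implementation: like_prefix_to_range is a hypothetical helper, the sample names are invented, and strings are compared by Unicode code point, whereas MySQL compares according to the column's collation.

```python
from typing import Tuple


def like_prefix_to_range(pattern: str) -> Tuple[str, str]:
    """Turn a pattern such as 'Abc%' into (lower, upper) so that
    name LIKE 'Abc%' matches the same strings as lower <= name < upper."""
    if len(pattern) < 2 or not pattern.endswith("%"):
        raise ValueError("expected a non-empty prefix followed by a trailing %")
    prefix = pattern[:-1]
    if "%" in prefix or "_" in prefix:
        raise ValueError("wildcards inside the prefix are not handled here")
    # Upper bound: replace the last character by its successor, 'Abc' -> 'Abd'.
    # Assumes the last character is not the highest possible code point.
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    return prefix, upper


lower, upper = like_prefix_to_range("Abc%")

# Invented sample data to compare the two forms of the condition.
names = ["Ab", "Abb", "Abc", "Abcd", "Abczzz", "Abd", "abc"]
by_prefix = [n for n in names if n.startswith("Abc")]
by_range = [n for n in names if lower <= n < upper]
print(by_prefix == by_range)
```

Because the rewritten condition is a closed lower bound and an open upper bound on one column, it maps directly onto a contiguous slice of a sorted index, which a pattern starting with % cannot do.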

Physical query optimization

The main job of physical query optimization is to evaluate the cost of each candidate execution plan for the SQL statement.

Physical query optimization mainly solves the following problems:

  • For a single-table scan, which method is the least expensive? (an index scan with lookups back to the table, or a full table scan)

  • When tables are joined, which join method is the least expensive?

Here is a brief look at cost evaluation. Cost is evaluated along two dimensions: CPU cost and IO cost.

Scanning method    Cost evaluation formula
Sequential scan    N_page * a_page_IO_time + N_tuple * a_tuple_CPU_time
Index scan         C_index + N_page_index * a_page_IO_time

The above parameters are explained as follows:

  • a_page_IO_time: the IO time needed to load one data page
  • N_page: the number of data pages
  • N_tuple: the number of tuples (a tuple can be understood as one row of data)
  • a_tuple_CPU_time: the CPU time needed to parse one tuple from a data page
  • C_index: the IO time spent on the index
  • N_page_index: the number of index pages
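
The two formulas can be tried out with a small Python sketch. It reads each formula as a sum of its two terms (an IO term plus a CPU term for the sequential scan, the index cost plus an IO term for the index scan). The function names are hypothetical and every number below is invented purely to show the arithmetic; none of them are MySQL defaults.

```python
def sequential_scan_cost(n_page, n_tuple, a_page_io_time, a_tuple_cpu_time):
    """N_page * a_page_IO_time + N_tuple * a_tuple_CPU_time"""
    return n_page * a_page_io_time + n_tuple * a_tuple_cpu_time


def index_scan_cost(c_index, n_page_index, a_page_io_time):
    """C_index + N_page_index * a_page_IO_time"""
    return c_index + n_page_index * a_page_io_time


# Invented inputs: 100 data pages holding 1000 rows, 20 index pages.
seq_cost = sequential_scan_cost(
    n_page=100, n_tuple=1000, a_page_io_time=1.0, a_tuple_cpu_time=0.25
)
idx_cost = index_scan_cost(c_index=3.0, n_page_index=20, a_page_io_time=1.0)

# The optimizer keeps whichever candidate has the lower estimated cost.
candidates = {"sequential scan": seq_cost, "index scan": idx_cost}
cheapest = min(candidates, key=candidates.get)
print(seq_cost, idx_cost, cheapest)  # 350.0 23.0 index scan
```

With different inputs, for instance a very small table, the comparison can go the other way, which is why a full table scan is sometimes the plan that gets chosen.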

For details on index cost calculation, see the article "Why did MySQL query choose to use this index? Based on MySQL 8.0.22 index cost calculation".

Execution plan

The execution plan is the product of the query optimizer and is eventually handed over to the execution engine, which calls the storage engine to carry it out. The execution plan tells us how MySQL will execute the SQL statement.

Use the EXPLAIN keyword to view the execution plan of a SQL statement. It provides the following information, among others:

  • id: the sequence number of each SELECT, which reflects the execution order in nested queries
  • possible_keys: the indexes that might be used for this query
  • key: the index that is actually used
  • rows: the estimated number of rows that must be examined to get the result
  • type: the join type, i.e. how each table is accessed when tables are joined (select_type is a separate column that gives the kind of SELECT, such as SIMPLE or SUBQUERY)
  • Extra: additional information, such as whether a covering index or index condition pushdown is used
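
As a small aid for reading these columns, the Python sketch below flags a few well-known signals in one row of EXPLAIN output. The helper review_explain_row and the sample row are hypothetical and written for this illustration. The signals themselves are standard: type = ALL is a full table scan, a NULL key means no index was chosen, "Using index" in Extra indicates a covering index, and "Using index condition" indicates index condition pushdown.

```python
def review_explain_row(row):
    """Return human-readable notes for one EXPLAIN row given as a dict."""
    notes = []
    if row.get("type") == "ALL":
        notes.append("full table scan (type = ALL)")
    if row.get("key") is None:
        notes.append("no index used (key is NULL)")
    # Extra can hold several items separated by semicolons.
    extra_items = {item.strip() for item in (row.get("Extra") or "").split(";")}
    if "Using index" in extra_items:
        notes.append("covering index (Using index)")
    if "Using index condition" in extra_items:
        notes.append("index condition pushdown (Using index condition)")
    return notes


# Invented sample row, shaped like a line of EXPLAIN output.
sample_row = {
    "id": 1,
    "select_type": "SIMPLE",
    "table": "USERINFO",
    "type": "ALL",
    "possible_keys": None,
    "key": None,
    "rows": 1000,
    "Extra": "Using where",
}
for note in review_explain_row(sample_row):
    print(note)
```

In practice the rows would come from running EXPLAIN through a database client. The sketch leaves that part out so that it stays self-contained.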

Storage engine

The MySQL server defines a specification for how data is stored, retrieved, and updated, and storage engines implement that specification. Different storage engines implement it in different ways, so each has its own features and characteristics. The most commonly used storage engines are InnoDB and MyISAM.

Let's briefly look at the characteristics of these two storage engines.

InnoDB:

  • Supports foreign keys and transactions, which helps ensure data integrity and consistency
  • Supports finer lock granularity (row-level locks), giving better control over locking and higher read and write efficiency under concurrency

MyISAM:

  • Does not support transactions and only supports table locks, so it is suited to read-mostly or read-only data scenarios

I will not go deeper into storage engines here. Later articles will continue to compare them and will analyze in detail how InnoDB updates data.

Summary

In the past, I only knew how to write a SQL statement in client software, click execute, and get the data.

Now I finally understand that after a query statement is passed to the MySQL server, it goes through the following series of operations:

  • The parser checks the syntax and vocabulary of the SQL statement. If there are no errors, the statement is split into nodes by keyword, and a parse tree is finally formed

  • The preprocessor checks the semantics of the SQL statement, such as whether the statement is ambiguous and whether the tables and fields exist, and forms a new parse tree

  • The query optimizer takes this parse tree, generates the candidate execution plans, and after logical and physical query optimization obtains the execution plan with the lowest estimated overhead

  • The execution engine takes this execution plan and calls the storage engine interface

  • The storage engine queries the data according to the execution plan. The query calls interfaces of the operating system's file system to read the data, and the result is finally returned to the client

Statement:
This article is reproduced from juejin.cn. If there is any infringement, please contact admin@php.cn to have it deleted.