search
HomeDatabaseMysql TutorialDetailed introduction to MySQL query optimization

1. Introduction

The most important thing about a good web application is excellent access performance. The database MySQL is an integral part of the web application and an important part that determines its performance. Therefore, it is crucial to improve the performance of MySQL.

The improvement of MySQL performance can be divided into three parts, including hardware, network, and software. Among them, hardware and network depend on the company's financial resources and require a lot of money, so I won't go into them here. The software is subdivided into many types. Here we achieve performance improvement through MySQL query optimization.

Recently I read some books about query optimization, and also read some articles written by seniors online.

The following is some summary of query optimization that I compiled and learned from:

2. Interception of SQL statements

1. Comprehensive query log

2. Slow query Log

3. Binary log

4. Process list

SHOW FULL PROCESSLIST;

. .

3. Basic analysis commands for query optimization

1. EXPLAIN {PARTITIONS|EXTENDED}

2. SHOW CREATE TABLE tab;

3. SHOW INDEXS FROM tab;

 4.SHOW TABLE STATUS LIKE 'tab';

 5.SHOW [GLOBAL|SESSION] STATUS LIKE '';

 6.SHOW VARIABLES

 . . . .

PS: I personally feel that all of them are nutritionally devoid of any nutrients. Here’s the real stuff.

4. Several directions for query optimization

1. Try to avoid full-text scanning, add indexes to corresponding fields, and use indexes to query

2. Delete unused or duplicate indexes

 3. Query rewriting, equivalent conversion (predicate, subquery, join query)

 4. Delete content and repeat unnecessary statements, and streamline statements

 5. Integrate repeatedly executed statements

 6. Cache query results

5. Index optimization

 5.1. Index advantages:

 1. Maintain data integrity

  2. Improve data query performance

  3. Improve table connection operations (jion)

  4. Sort query results. If there is no index, the internal file sorting algorithm will be used for sorting, which is slower. 5. Simplify aggregated data operations. 5.2. Disadvantages of indexing. 1. The index needs to occupy a certain amount of space. Storage space

  2. Data insertion, update, and deletion will be affected by the index, and performance will be reduced. Because the data changes, the index also needs to be updated

 3. Multiple indexes, if the optimizer takes time, the best choice

 5.3. Index selection

 1. When the amount of data is large Use

  2. When the data is highly repetitive,

will not be used 3. If the query retrieves more than 20% of the data, full-text scanning will be used without indexing

5.4. Detailed study of index

Data query:

InnoDB and MyISAM in MySQL are B-Tree type indexes

B-Tree includes: PRIMARY KEY, UNIQUE, INDEX, and FULLTEXT

 B-Tree type index is not supported (that is, when the field uses the following symbols, the index will not be used):

  >, =, ,like '%**'

  【Here I will introduce the covering index first】

                            Around  I will introduce it in a way that I understand. Covering indexes do not really exist like primary key indexes and unique indexes. It is just a definition of certain specific scenarios for index application [another understanding: the queried column is an index column, so the column is covered by the index]. It can break through traditional limitations, use the above operators, and still use indexes for queries.

Because the queried column is an index column, there is no need to read the row, only the column field data needs to be read. [For example, if you are reading a book and need to find a certain content, and that content happens to appear in the table of contents, you don’t need to turn page by page, just locate the page in the table of contents and search]

  How to activate What about covering indexes? What is a specific scenario?

The index field just appears in the select.

Compound indexes may also have other special scenarios. For example, for a three-column composite index, you only need to have the leftmost column of the composite index appear once in select, where, group by, and order by to activate the use of the covering index.

View:

Extra in EXPLAIN displays Using index, indicating that this statement uses a covering index.

Conclusion:

It is not recommended to use select*from when querying. You should write the fields you need and add corresponding indexes to improve querying. performance.

Actual test results for the above operators: 1. In the form of select*from, where is the primary key and can be used to kill [except like] (use the primary key for query); index cannot be used at all Can.

2. Test in the form of select field a from tab where field a "above operator", the result can still be queried using the index. [Using covering index]

Other index optimization methods:

1. Use index keywords as connection conditions

2. Use compound indexes

3. Index merging or and, will involve The fields to be merged into a composite index

4. Add index to the fields involved in where, and group by

6. Subquery optimization

In from, it is a non-correlated subquery. Subqueries can be pulled up to the parent layer. In multi-table join queries, consider the join cost before selecting.

The query optimizer generally uses nested execution for subqueries, that is, executing the subquery once for each row in the parent query, so that the subquery will be executed many times. This execution method is very inefficient.

Advantages of converting subqueries into join queries:

1. The subquery does not need to be executed many times

2. The optimizer can choose different methods and connection sequences based on the information

 3. The connection conditions and filtering conditions of the subquery become the filtering conditions of the parent query to improve efficiency.

Optimization:

Subquery merging. If there are multiple subqueries, try to merge them as much as possible.

Subquery expansion, that is, pull-up becomes a multi-table query (equivalent changes are guaranteed at all times)

Note:

Subquery expansion can only expand simple queries. If the subquery If the query contains aggregate functions, GROUP BY, and DISTINCT, it cannot be pulled up.

Select * from t1 (select*from tab where id>10) as t2 where t1.age>10 and t2.age

select*from t1,tab as t2 where t1.age>10 and t2.age10;

Specific steps:

1. Merge from and from and modify the corresponding parameters

2 , merge where with where, use and to connect

3. Modify the corresponding predicate (change = in in)

7. Rewrite the equivalent predicate:

1. BETWEEEN AND Rewrite it as >=,

 2. In converts multiple or. When the field is an index, both can use the index, or is more efficient than in

3. Name like 'abc%' is rewritten as name>='abc' and name

Note: In the million-level data test, the like query before name is not indexed is faster than the latter query; after adding an index to the field, the latter query is a little faster, but there is not much difference, because both methods are used when querying To the index.

 . . . .

8. Condition simplification and optimization

1. Combine where, having (when there are no groupby and aggregate functions), and join-on conditions as much as possible

2. Delete unnecessary parentheses, reduce the or and and tree layers of syntax, and reduce CPU consumption

 3. Constant transfer. a=b and b=2 is converted to a=2 and b=2. Try not to use variables a=b or a=@var

4. Eliminate useless SQL conditions

5. Try not to calculate expressions on the right side of the where equal sign; do not use fields in where Calculate expressions and use functions

 6. Identity transformation and inequality transformation. Example: Testing millions of data a>b and b>10 becomes a>b and a>10 and b>10 with significant optimization

9, external connection optimization

About to convert external connections For inner joins

Advantages:

1. The optimization processor handles outer joins in more steps than inner joins and is time-consuming

2. After the outer joins are eliminated, the optimizer selects multiple tables There are more choices for the connection sequence, you can choose the best

 3. You can use the table with the strictest filtering conditions as the outer surface (the front of the connection sequence is the outer loop layer of the multi-layer loop body),

It can reduce unnecessary I/O overhead and speed up algorithm execution.

The difference between on a.id=b.id and where a.id=b.id, on means the table is connected, and where means data comparison

Note: The premise must be that the result is NULL avoidance (that is, the condition is restricted to no NULL data rows, semantically speaking, it is an inner connection)

Optimization principles:

Streamline queries, eliminate connections, equivalent conversions, and remove redundant table object connections

For example: the primary key/unique key is used as the connection condition, and the intermediate table column is only used as the equivalent condition, the intermediate table connection can be removed

10. Other query optimization

1. The following will be Causes the index query to be abandoned and full-text scanning is used

 1.1. Use the != or operator in the where clause. Note: Primary key support. Non-primary keys do not support

1.2. Avoid using or used, so the specific situation should be analyzed on a case-by-case basis.

Similar optimization:

select * from tab name='aa' or name='bb';

Select * from tab name='aa'

   union all

   select * from tab name='bb';

    Actual measurement:

   1. One hundred thousand data test, Without any index, the above query is twice as fast as the query below.

2. In the 300,000 data test, when aa and bb are indexed separately, the following query speed is a little faster than or.

 1.3. Avoid using not in

  Not in generally cannot use indexes; primary key fields can

  1.4. Try to avoid using the judgment of null in where 

 1.5. like cannot be preceded by a percent sign like '%.com'

Solution:

1. If you must use % prefix and the data length is not large, such as URL, you can flip the data and save it Enter the database and check again. LIKE REVERSE'%.com';

                                             

  1.6. When using an index field as a condition, if it is a compound index, the field name with the leftmost prefix of the index should be used

2. Replace exists with in

Select num from a where num in(select num from b)

Select num from a where exists(select 1 from b where num =a.num)

With one million pieces of data, it takes 6.65s and 4.18s to filter 59417 pieces of data. No other optimizations were done, just replacing exists with in.

 3. The field definition is a string. There are no quotation marks when querying, and no index will be used. Full-text scanning will be performed.

[The following is an excerpt from Luantanqin’s blog post http://www.cnblogs.com/lingiu/p/3414134.html. I have not conducted the corresponding test]

4. Try to use it as much as possible table variables instead of temporary tables

5. Avoid frequently creating and deleting temporary tables to reduce the consumption of system table resources

6. If a temporary table is used, be sure to add it at the end of the stored procedure To delete all temporary tables explicitly, first truncate table, and then drop table, this can avoid long-term locking of system tables

 7. Try to avoid using cursors, because cursors are less efficient. If the cursor operation is If the data exceeds 10,000 rows, then you should consider rewriting

8. Large data volume. If the data volume is too large, you should consider whether the corresponding requirements are reasonable.

 9. Try to avoid large transaction operations and improve system concurrency.

The above is the detailed content of Detailed introduction to MySQL query optimization. For more information, please follow other related articles on the PHP Chinese website!

Statement
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn
How does MySQL's licensing compare to other database systems?How does MySQL's licensing compare to other database systems?Apr 25, 2025 am 12:26 AM

MySQL uses a GPL license. 1) The GPL license allows the free use, modification and distribution of MySQL, but the modified distribution must comply with GPL. 2) Commercial licenses can avoid public modifications and are suitable for commercial applications that require confidentiality.

When would you choose InnoDB over MyISAM, and vice versa?When would you choose InnoDB over MyISAM, and vice versa?Apr 25, 2025 am 12:22 AM

The situations when choosing InnoDB instead of MyISAM include: 1) transaction support, 2) high concurrency environment, 3) high data consistency; conversely, the situation when choosing MyISAM includes: 1) mainly read operations, 2) no transaction support is required. InnoDB is suitable for applications that require high data consistency and transaction processing, such as e-commerce platforms, while MyISAM is suitable for read-intensive and transaction-free applications such as blog systems.

Explain the purpose of foreign keys in MySQL.Explain the purpose of foreign keys in MySQL.Apr 25, 2025 am 12:17 AM

In MySQL, the function of foreign keys is to establish the relationship between tables and ensure the consistency and integrity of the data. Foreign keys maintain the effectiveness of data through reference integrity checks and cascading operations. Pay attention to performance optimization and avoid common errors when using them.

What are the different types of indexes in MySQL?What are the different types of indexes in MySQL?Apr 25, 2025 am 12:12 AM

There are four main index types in MySQL: B-Tree index, hash index, full-text index and spatial index. 1.B-Tree index is suitable for range query, sorting and grouping, and is suitable for creation on the name column of the employees table. 2. Hash index is suitable for equivalent queries and is suitable for creation on the id column of the hash_table table of the MEMORY storage engine. 3. Full text index is used for text search, suitable for creation on the content column of the articles table. 4. Spatial index is used for geospatial query, suitable for creation on geom columns of locations table.

How do you create an index in MySQL?How do you create an index in MySQL?Apr 25, 2025 am 12:06 AM

TocreateanindexinMySQL,usetheCREATEINDEXstatement.1)Forasinglecolumn,use"CREATEINDEXidx_lastnameONemployees(lastname);"2)Foracompositeindex,use"CREATEINDEXidx_nameONemployees(lastname,firstname);"3)Forauniqueindex,use"CREATEU

How does MySQL differ from SQLite?How does MySQL differ from SQLite?Apr 24, 2025 am 12:12 AM

The main difference between MySQL and SQLite is the design concept and usage scenarios: 1. MySQL is suitable for large applications and enterprise-level solutions, supporting high performance and high concurrency; 2. SQLite is suitable for mobile applications and desktop software, lightweight and easy to embed.

What are indexes in MySQL, and how do they improve performance?What are indexes in MySQL, and how do they improve performance?Apr 24, 2025 am 12:09 AM

Indexes in MySQL are an ordered structure of one or more columns in a database table, used to speed up data retrieval. 1) Indexes improve query speed by reducing the amount of scanned data. 2) B-Tree index uses a balanced tree structure, which is suitable for range query and sorting. 3) Use CREATEINDEX statements to create indexes, such as CREATEINDEXidx_customer_idONorders(customer_id). 4) Composite indexes can optimize multi-column queries, such as CREATEINDEXidx_customer_orderONorders(customer_id,order_date). 5) Use EXPLAIN to analyze query plans and avoid

Explain how to use transactions in MySQL to ensure data consistency.Explain how to use transactions in MySQL to ensure data consistency.Apr 24, 2025 am 12:09 AM

Using transactions in MySQL ensures data consistency. 1) Start the transaction through STARTTRANSACTION, and then execute SQL operations and submit it with COMMIT or ROLLBACK. 2) Use SAVEPOINT to set a save point to allow partial rollback. 3) Performance optimization suggestions include shortening transaction time, avoiding large-scale queries and using isolation levels reasonably.

See all articles

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

Video Face Swap

Video Face Swap

Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Tools

SublimeText3 English version

SublimeText3 English version

Recommended: Win version, supports code prompts!

ZendStudio 13.5.1 Mac

ZendStudio 13.5.1 Mac

Powerful PHP integrated development environment

MinGW - Minimalist GNU for Windows

MinGW - Minimalist GNU for Windows

This project is in the process of being migrated to osdn.net/projects/mingw, you can continue to follow us there. MinGW: A native Windows port of the GNU Compiler Collection (GCC), freely distributable import libraries and header files for building native Windows applications; includes extensions to the MSVC runtime to support C99 functionality. All MinGW software can run on 64-bit Windows platforms.

SAP NetWeaver Server Adapter for Eclipse

SAP NetWeaver Server Adapter for Eclipse

Integrate Eclipse with SAP NetWeaver application server.

Atom editor mac version download

Atom editor mac version download

The most popular open source editor