SQL statements handle both data manipulation and data definition, which makes them very convenient for users. This article lists 52 strategies for optimizing the performance of SQL statements.
SQL statement performance optimization strategies
1. When optimizing a query, try to avoid full table scans. First consider creating indexes on the columns involved in WHERE and ORDER BY.
2. Try to avoid testing fields for NULL in the WHERE clause. NULL is the default when a table is created, but most of the time you should declare columns NOT NULL, or use a special value such as 0 or -1 as the default.
3. Try to avoid using the != or <> operator in the WHERE clause. MySQL uses indexes only for the following operators: <, <=, =, >, >=, BETWEEN, IN, and sometimes LIKE.
4. Try to avoid using OR to join conditions in the WHERE clause, otherwise the engine may give up using the index and perform a full table scan. You can use UNION to merge the queries instead.
5. Use IN and NOT IN with caution as well, since they can lead to a full table scan. For a continuous range of values, use BETWEEN rather than IN where possible.
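6. The following queries will also cause a full table scan, because the pattern starts with a wildcard:
select id from t where name like '%abc%'
or
select id from t where name like '%abc'
To improve efficiency, consider full-text search. A pattern with a fixed prefix, such as like 'abc%', can use the index.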
7. Using parameters (local variables) in the WHERE clause can also cause a full table scan, because the optimizer has to choose an access plan before the value is known.
8. Try to avoid expression operations and function operations on fields in the WHERE clause.
9. Using EXISTS instead of IN is often a good choice.
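A minimal sketch of the IN-to-EXISTS rewrite, using SQLite through Python's sqlite3 module purely as a convenient stand-in. The customers and orders tables are hypothetical, and whether EXISTS is actually faster depends on the engine and its optimizer; the example only shows that the two forms return the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO customers VALUES (1, 'ann'), (2, 'bob'), (3, 'cid');
    INSERT INTO orders VALUES (10, 1), (11, 1), (12, 3);
""")

# IN form: the subquery produces the list of customer ids that have orders.
in_rows = conn.execute("""
    SELECT c.name FROM customers c
    WHERE c.id IN (SELECT o.customer_id FROM orders o)
    ORDER BY c.name
""").fetchall()

# EXISTS form: a correlated subquery that only needs to find one match.
exists_rows = conn.execute("""
    SELECT c.name FROM customers c
    WHERE EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id)
    ORDER BY c.name
""").fetchall()

print(in_rows == exists_rows, exists_rows)
```

Both queries return the customers that have at least one order, so the rewrite can be checked for equivalence before comparing execution plans on the real database.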
10. Although an index can make the corresponding SELECT more efficient, it also makes INSERT and UPDATE less efficient, because the index may be rebuilt during an INSERT or UPDATE. It is best not to have more than 6 indexes on a table.
11. Avoid updating clustered index columns as much as possible, because the order of the clustered index columns is the physical storage order of the table's records. Once a column value changes, the order of records across the whole table has to be adjusted, which consumes considerable resources.
12. Use numeric types where possible. Fields that contain only numeric information should not be designed as character types, as this reduces the performance of queries and joins and increases storage overhead.
13. Use varchar and nvarchar instead of char and nchar as much as possible. First, variable-length fields take less storage space; second, searching within a smaller field is more efficient for queries.
14. Avoid returning everything with select * from t. Replace "*" with a specific field list, and do not return any unused fields.
15. Try to avoid returning large amounts of data to the client. If the amount of data is too large, you should consider whether the corresponding requirements are reasonable.
16. Use table aliases: when joining multiple tables in a SQL statement, use table aliases and prefix each column with its alias. This reduces parsing time and avoids syntax errors caused by ambiguous column names.
17. Use temporary tables to hold intermediate results.
An important way to simplify SQL statements is to store intermediate results in a temporary table. Subsequent queries then run against tempdb, which avoids scanning the main table several times in the program and greatly reduces cases where a shared lock blocks an update lock during execution. This reduces blocking and improves concurrency.
18. Add nolock to some SQL queries. Reads and writes block each other; to improve concurrency, some queries can use nolock, which allows writes while reading. The drawback is that uncommitted dirty data may be read.
There are three principles for using nolock:
If the query results are used for inserts, deletes, or updates, do not add nolock;
If page splits occur frequently on the queried table, use nolock with caution;
A temporary table can also preserve the "before image" of the data, much like Oracle's undo tablespace. Where a temporary table can be used to improve concurrency, do not use nolock.
19. Common simplification rules: do not join more than 5 tables (JOIN); consider using temporary tables or table variables to store intermediate results; use subqueries sparingly; and do not nest views too deeply. As a rule, nest no more than 2 views.
20. Pre-calculate the results that will be queried and store them in a table, then simply SELECT them at query time.
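One way to picture pre-calculated results is a small summary table that is filled ahead of time and read directly. This is only a sketch using SQLite via Python's sqlite3; the sales and sales_by_region tables are hypothetical, and how the summary is kept fresh (batch job, trigger, application code) is left open.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, amount INTEGER);
    INSERT INTO sales (region, amount) VALUES
        ('north', 100), ('north', 250), ('south', 75);

    -- Hypothetical summary table, refreshed ahead of time.
    CREATE TABLE sales_by_region (region TEXT PRIMARY KEY, total INTEGER);
    INSERT INTO sales_by_region
        SELECT region, SUM(amount) FROM sales GROUP BY region;
""")

# Readers select from the small precomputed table instead of
# aggregating the detail rows on every request.
totals = dict(conn.execute("SELECT region, total FROM sales_by_region"))
print(totals)
```

The trade-off is staleness: the summary is only as current as its last refresh, so this suits reports that tolerate slightly old numbers.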
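21. A clause using OR can be split into several queries joined with UNION. Their speed depends on whether indexes are used: if the query needs a composite index, UNION ALL is more efficient. When a clause with multiple ORs does not use an index, rewrite it in UNION form and try to match it to an index.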
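The OR-to-UNION rewrite can be sketched as follows, again with SQLite via Python's sqlite3 as a stand-in. The table t, its columns, and the index names are assumptions for illustration, and many modern optimizers can already combine indexes for OR, so the benefit should be measured rather than assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER);
    CREATE INDEX idx_t_a ON t (a);
    CREATE INDEX idx_t_b ON t (b);
    INSERT INTO t (a, b) VALUES (1, 9), (2, 2), (3, 2), (1, 2);
""")

or_ids = conn.execute(
    "SELECT id FROM t WHERE a = 1 OR b = 2 ORDER BY id"
).fetchall()

# Each branch can use its own single-column index. UNION removes the
# row that matches both conditions; UNION ALL would return it twice.
union_ids = conn.execute("""
    SELECT id FROM t WHERE a = 1
    UNION
    SELECT id FROM t WHERE b = 2
    ORDER BY id
""").fetchall()

union_all_ids = conn.execute("""
    SELECT id FROM t WHERE a = 1
    UNION ALL
    SELECT id FROM t WHERE b = 2
""").fetchall()

print(or_ids == union_ids, len(union_all_ids))
```

UNION ALL skips the de-duplication step, which is why it is cheaper, but it is only a drop-in replacement when the branches cannot return the same row.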
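22. In the list of values after IN, put the most frequently occurring values first and the least frequent last, to reduce the number of comparisons.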
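23. Do as much of the data processing as possible on the server, for example with stored procedures. A stored procedure is a set of SQL and control-flow statements that has been compiled, optimized, organized into an execution plan, and stored in the database, so it runs fast. For dynamic SQL that is executed repeatedly, a temporary stored procedure can be used; the procedure (temporary table) is placed in tempdb.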
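24. When the server has plenty of memory, set the configured number of threads to the maximum number of connections + 5 for the best efficiency; otherwise, use a configured number of threads [...] "=", and do not use ">".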
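28. Index usage guidelines:
Create indexes with the application in mind; large OLTP tables should have no more than 6 indexes;
Use indexed fields as query conditions wherever possible, especially the clustered index; if necessary, use index index_name to force a specific index;
Avoid table scans when querying large tables; consider creating a new index if necessary;
When an indexed field is used as a condition and the index is a composite index, the first field of the index must appear in the condition for the system to use the index; otherwise the index will not be used;
Maintain indexes: rebuild them periodically and recompile stored procedures.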
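The rule about the first field of a composite index can be observed with a query plan. The sketch below uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 as a stand-in; the table t and the index idx_t_a_b are hypothetical, the exact plan wording varies by SQLite version, and other engines have their own EXPLAIN output and may apply extra tricks such as skip scans.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER, c TEXT);
    CREATE INDEX idx_t_a_b ON t (a, b);
""")

def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is a readable detail string.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql)
    return " | ".join(row[-1] for row in rows)

# Condition on the leading column: the composite index can be used.
leading = plan("SELECT c FROM t WHERE a = 1")

# Condition only on the second column: the index is not usable for a lookup.
non_leading = plan("SELECT c FROM t WHERE b = 1")

print(leading)
print(non_leading)
```

In this setup the first plan searches through idx_t_a_b while the second falls back to scanning the table, which is the behavior the guideline warns about.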
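29. In the following SQL conditions, every column has an appropriate index, yet execution is very slow:
SELECT * FROM record WHERE substring(card_no, 1, 4) = '5378' -- 13 seconds
SELECT * FROM record WHERE amount/30 < 1000 -- 11 seconds
SELECT * FROM record WHERE convert(char(10), date, 112) = '19991201' -- 10 seconds
Analysis: the result of any operation on a column in the WHERE clause is computed row by row when the SQL runs, so the engine has to search the table and cannot use the index on that column. If the result can be determined when the query is compiled, the SQL optimizer can use the index and avoid the table search. The SQL is therefore rewritten as follows:
SELECT * FROM record WHERE card_no like '5378%' -- < 1 second
SELECT * FROM record WHERE amount < 1000*30 -- < 1 second
SELECT * FROM record WHERE date = '1999/12/01' -- < 1 second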
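The amount example above can be reproduced on a small scale. This is a sketch with SQLite via Python's sqlite3, not the original SQL Server setup: the sample data and the index name idx_record_amount are assumptions, and the timings quoted in the text are not reproduced. It only shows that moving the arithmetic off the column lets the planner use the index while returning the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE record (id INTEGER PRIMARY KEY, card_no TEXT, amount INTEGER);
    CREATE INDEX idx_record_amount ON record (amount);
""")
conn.executemany(
    "INSERT INTO record (card_no, amount) VALUES (?, ?)",
    [(str(5378000 + i), i * 100) for i in range(1000)],
)

def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is a readable detail string.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql)
    return " | ".join(row[-1] for row in rows)

# Arithmetic on the column hides it from the index.
slow_sql = "SELECT card_no FROM record WHERE amount / 30 < 1000"
# The same condition with the arithmetic moved to the constant side.
fast_sql = "SELECT card_no FROM record WHERE amount < 1000 * 30"

slow_plan = plan(slow_sql)
fast_plan = plan(fast_sql)
same_rows = sorted(conn.execute(slow_sql)) == sorted(conn.execute(fast_sql))

print(slow_plan)
print(fast_plan)
print(same_rows)
```

Note that the two conditions are equivalent here because amount is a non-negative integer; with fractional or negative values the rewrite should be checked before it is applied.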
30. When a batch of rows needs to be inserted or updated, use a batch insert or batch update; never update the records one at a time.
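A minimal sketch of a batch insert, assuming a hypothetical events table and using Python's sqlite3 as a stand-in for whichever driver the application uses. The idea is one prepared statement and one transaction for the whole batch instead of a separate statement and commit per row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")

rows = [("event-%d" % i,) for i in range(10_000)]

# The connection used as a context manager commits once at the end
# (or rolls back on error), so the batch is a single transaction.
with conn:
    conn.executemany("INSERT INTO events (payload) VALUES (?)", rows)

inserted = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(inserted)
```

Most MySQL drivers offer a comparable executemany call, and multi-row INSERT ... VALUES (...), (...) statements serve the same purpose in plain SQL.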
31. In stored procedures, never use a loop for anything that can be done with a single SQL statement.
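32. Choose the most efficient table name order (only effective in the rule-based optimizer):
Oracle's parser processes the table names in the FROM clause from right to left, so the table written last in the FROM clause (the driving table) is processed first. When the FROM clause contains several tables, choose the table with the fewest rows as the driving table. If more than 3 tables are joined, choose the intersection table as the driving table; the intersection table is the one that is referenced by the other tables.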
33. Improve the efficiency of GROUP BY statements by filtering out unneeded records before the GROUP BY.
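Filtering before grouping usually means putting row-level conditions in WHERE rather than HAVING. The sketch below, using SQLite via Python's sqlite3 and a hypothetical orders table, shows the two forms returning the same result; some optimizers push such a HAVING condition down automatically, so the gain depends on the engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount INTEGER);
    INSERT INTO orders (region, amount) VALUES
        ('north', 10), ('north', 20), ('south', 5), ('west', 7);
""")

# Filter after grouping: every region is aggregated, then most are discarded.
having_rows = conn.execute("""
    SELECT region, SUM(amount) FROM orders
    GROUP BY region
    HAVING region = 'north'
""").fetchall()

# Filter before grouping: only the matching rows reach GROUP BY.
where_rows = conn.execute("""
    SELECT region, SUM(amount) FROM orders
    WHERE region = 'north'
    GROUP BY region
""").fetchall()

print(having_rows == where_rows, where_rows)
```

HAVING remains the right place for conditions on aggregates, such as SUM(amount) > 100, which cannot be evaluated before the grouping.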
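34. Write SQL statements in uppercase, because Oracle always parses the statement first and converts lowercase letters to uppercase before executing it.
35. Use aliases. Aliasing is a technique for large databases: table and column names are given a one-letter alias in the query, which is said to be 1.5 times faster than building a join table.
36. Avoid deadlocks: always access the same tables in the same order in your stored procedures and triggers; keep transactions as short as possible and limit the amount of data they touch; never wait for user input inside a transaction.
37. Avoid temporary tables unless they are needed; table variables can be used instead. In most cases table variables can be served from memory, so they are usually faster than temporary tables; temporary tables live in the tempdb database, so operations on them incur extra overhead.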
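38. It is best not to use triggers:
Firing a trigger and executing the trigger event is itself a resource-consuming process;
If something can be implemented with a constraint, do not use a trigger;
Do not use the same trigger for different triggering events (INSERT, UPDATE, and DELETE);
Do not use transactional code in triggers.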
39. Index creation rules:
The primary key and foreign key of the table must have indexes;
Tables with more than 300 rows should have indexes;
Tables that are frequently joined to other tables should have indexes on the join fields;
Fields that often appear in the WHERE clause, especially fields in large tables, should be indexed;
Indexes should be built on fields with high selectivity;
Indexes should be built on small fields; do not build indexes on large text fields or very long fields;
Creating a composite index requires careful analysis; consider whether a single-field index would do instead;
Choose the leading column of a composite index correctly; it is generally the field with the best selectivity;
Do several fields of the composite index often appear together in the WHERE clause joined by AND, with few or no single-field queries? If so, create a composite index; otherwise consider single-field indexes;
If the fields in a composite index often appear alone in the WHERE clause, break it into several single-field indexes;
If a composite index contains more than 3 fields, consider carefully whether they are all necessary and reduce the number of fields;
If there are both single-field indexes and a composite index on the same fields, the composite index can generally be deleted;
For tables with frequent data operations, do not create too many indexes;
Delete useless indexes so that they do not affect the execution plan;
Every index on a table adds storage overhead and adds processing overhead to insert, delete, and update operations. In addition, too many composite indexes are generally of no value when single-field indexes exist; on the contrary, they reduce performance when data is added or deleted, and the negative impact is even greater for frequently updated tables;
Try not to index a field that contains a large number of duplicate values.
40. Summary of MySQL query optimization:
Use the slow query log to find slow queries;
Use the execution plan to determine whether queries are running properly;
Always test your queries to see whether they are running optimally, because performance changes over time;
Avoid using count(*) on an entire table, since it may lock the whole table;
Keep queries consistent so that subsequent similar queries can use the query cache;
Use GROUP BY instead of DISTINCT where appropriate;
Use indexed columns in the WHERE, GROUP BY, and ORDER BY clauses;
Keep indexes simple and do not include the same column in multiple indexes;
Sometimes MySQL uses the wrong index; in that case use USE INDEX;
Check for problems using SQL_MODE=STRICT;
For indexed fields with fewer than 5 records, use LIMIT with UNION rather than OR;
To avoid a SELECT before an update, use INSERT ON DUPLICATE KEY or INSERT IGNORE rather than implementing it with UPDATE;
Do not use MAX; use an indexed field with an ORDER BY clause instead;
LIMIT M, N can actually slow queries down in some cases, so use it sparingly;
Use UNION in the WHERE clause instead of a subquery;
After restarting MySQL, remember to warm up the database so that data is in memory and queries are fast;
Consider persistent connections instead of multiple connections to reduce overhead;
Benchmark queries under realistic server load, because sometimes a simple query can affect other queries;
When the load on the server increases, use SHOW PROCESSLIST to view slow and problematic queries;
Test all suspicious queries in the development environment against a mirror of production data.
41. MySQL backup process:
Back up from a secondary replication server;
Stop replication during the backup to avoid inconsistencies in data dependencies and foreign key constraints;
Stop MySQL completely and back up from the database files;
If you use mysqldump for the backup, back up the binary log files as well, to make sure replication is not interrupted;
Do not trust LVM snapshots; they are likely to produce data inconsistencies that will cause trouble later;
To make single-table recovery easier, export data table by table if the data is isolated from other tables;
Use --opt when using mysqldump;
Check and optimize tables before backing up;
To import faster, temporarily disable foreign key constraints and uniqueness checks during the import;
After each backup, calculate the size of the databases, tables, and indexes so that data growth can be monitored;
Perform regular backups.
42. The query cache does not handle whitespace automatically. When writing SQL statements, minimize the use of spaces, especially at the beginning and end of the statement, because the query cache does not trim leading and trailing spaces.
43. Is it convenient to split the member table using mid as the key? In typical business requirements, username is the main basis for queries, so the table would normally be split by hashing username and taking the modulus. As for splitting tables, MySQL's partition feature does exactly this and is transparent to the code; implementing it at the code level seems unreasonable.
44. Every table in the database should have an ID column as its primary key, preferably of an INT type (UNSIGNED is recommended), with the AUTO_INCREMENT flag set.
45. Put SET NOCOUNT ON at the beginning of every stored procedure and trigger, and SET NOCOUNT OFF at the end. This avoids sending a DONE_IN_PROC message to the client after each statement in the stored procedure or trigger.
46. Enable the MySQL query cache. This is one of the effective ways to improve MySQL performance: when the same query is executed many times, pulling the result from the cache is much faster than fetching it from the database again. (The query cache is deprecated as of MySQL 5.7.20 and was removed in MySQL 8.0.)
47. Use EXPLAIN on your SELECT queries to see how they are executed:
The EXPLAIN keyword shows how MySQL processes your SQL statement, which helps you find performance bottlenecks in the query or the table structure. The EXPLAIN output also tells you how indexes and primary keys are used and how your tables are searched and sorted.
48. Use LIMIT 1 when only one row is needed:
Sometimes when you query a table you already know there will be only one result, but you may still need to fetch a cursor or check the number of records returned. In this case, adding LIMIT 1 can improve performance: the MySQL engine stops searching after it finds one matching row instead of continuing to look for the next one.
49. Choose an appropriate storage engine for each table:
MyISAM: for applications dominated by reads and inserts, with only a few updates and deletes, and without high requirements for transactional integrity or concurrency.
InnoDB: for transaction processing and for data consistency under concurrency, where the workload includes many updates and deletes in addition to inserts and queries. (InnoDB effectively reduces the locking caused by deletes and updates.)
For InnoDB tables, which support transactions, the main factor affecting speed is that AUTOCOMMIT is on by default. If the program does not explicitly call BEGIN to start a transaction, every inserted row is committed automatically, which seriously hurts speed. Call BEGIN before executing the SQL so that multiple statements form one transaction (even with autocommit on); this greatly improves performance.
50. Optimize the table's data types and choose appropriate ones:
Principle: smaller is usually better, simple is better, every field should have a default value, and NULL should be avoided where possible. MySQL handles access to large amounts of data well, but in general the smaller a table is, the faster queries on it run. So when creating a table, make the fields as narrow as is practical to get better performance.
Similarly, where possible use MEDIUMINT instead of BIGINT for integer fields, and declare fields NOT NULL so that the database does not have to compare NULL values when executing queries.
Some text fields, such as "province" or "gender", can be defined as ENUM. MySQL treats ENUM as numeric data, and numeric data is processed much faster than text, which improves database performance.
51. String data type: char, varchar, text.
52. Any operation applied to a column, including database functions and calculated expressions, results in a table scan. When querying, move the operation to the right-hand side of the comparison wherever possible.