


Key points for optimizing queries over large data sets in MySQL: 1. Avoid full table scans as far as possible; 2. Avoid NULL checks on columns in the where clause; 3. Use IN and NOT IN with caution; 4. Avoid joining conditions with OR in the where clause; 5. Avoid cursors.
Methods for optimizing queries over large data sets in MySQL:
1. When optimizing a query, avoid full table scans as far as possible; first consider creating indexes on the columns used in where and order by.
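The effect of an index on the access plan can be sketched as follows. This is a minimal illustration that uses Python's built-in sqlite3 module as a stand-in so that it runs without a server; the index name idx_t_num and the sample data are assumptions, and MySQL's own EXPLAIN output looks different.

```python
import sqlite3

def plan(conn, sql):
    """Return the detail column of SQLite's EXPLAIN QUERY PLAN for sql."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, num INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO t (num, name) VALUES (?, ?)",
    [(i % 100, f"name{i}") for i in range(1000)],  # made-up sample rows
)

query = "SELECT id FROM t WHERE num = 10"

before = plan(conn, query)  # no index on num yet: the table is scanned
# In MySQL the same statement would be: CREATE INDEX idx_t_num ON t (num);
conn.execute("CREATE INDEX idx_t_num ON t (num)")
after = plan(conn, query)   # the index can now serve the lookup

print("before:", before)
print("after: ", after)
```

In MySQL, prefixing the query with EXPLAIN serves the same purpose of checking whether an index is actually chosen.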
2. Be careful with NULL checks on columns in the where clause. The usual advice is that such a check makes the engine give up the index and perform a full table scan, for example: select id from t where num is null. MySQL can often use an index on num for IS NULL, so check the plan with EXPLAIN before changing the schema. If NULL carries no meaning for the column, you can set a default value of 0 on num, make sure the num column contains no NULL values, and then query like this:
select id from t where num=0
3. Try to avoid using the != or <> operators in the where clause, otherwise the engine may give up using the index and perform a full table scan.
4. Try to avoid joining conditions with OR in the where clause, otherwise the engine may give up using the index and perform a full table scan (most likely when the conditions are on different columns), for example: select id from t where num=10 or num=20. You can query like this instead:
select id from t where num=10 union all select id from t where num=20
5. IN and NOT IN should also be used with caution, because they can lead to a full table scan (NOT IN in particular), for example: select id from t where num in(1,2,3). For continuous values, use between instead of IN:
select id from t where num between 1 and 3
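The rewrites in tips 4 and 5 are only safe when they return the same rows as the original query. The sketch below checks that on a small made-up table, again using sqlite3 as a stand-in for MySQL; the helper name ids and the sample values are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, num INTEGER)")
conn.executemany(
    "INSERT INTO t (num) VALUES (?)",
    [(n,) for n in (1, 2, 3, 10, 20, 30)],  # made-up sample values
)

def ids(sql):
    """Run sql and return the selected ids in sorted order."""
    return sorted(row[0] for row in conn.execute(sql))

# Tip 4: OR on one column versus UNION ALL of two single-value lookups.
or_ids = ids("SELECT id FROM t WHERE num = 10 OR num = 20")
union_ids = ids(
    "SELECT id FROM t WHERE num = 10 "
    "UNION ALL SELECT id FROM t WHERE num = 20"
)

# Tip 5: IN over consecutive integers versus BETWEEN.
in_ids = ids("SELECT id FROM t WHERE num IN (1, 2, 3)")
between_ids = ids("SELECT id FROM t WHERE num BETWEEN 1 AND 3")

print(or_ids == union_ids, in_ids == between_ids)
```

Two caveats apply: union all returns a row twice if it matches both branches, so the branches must be mutually exclusive, and between only matches IN when the column holds integers and the listed values are consecutive.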
6. The following query will also result in a full table scan, because a pattern with a leading wildcard cannot use an ordinary index: select id from t where name like '%李%'. To improve efficiency, consider full-text search.
7. Using parameters (local variables) in the where clause can also cause a full table scan. The reasoning is that SQL resolves local variables only at run time, while the optimizer has to choose an access plan at compile time; at that point the variable values are still unknown and cannot be used as input for index selection. This describes SQL Server's plan compilation, so in MySQL check the actual plan with EXPLAIN. For example, the following statement may perform a full table scan: select id from t where num=@num
You can change it to force the query to use an index (this is SQL Server hint syntax; in MySQL the equivalent hint is force index(index_name), written after the table name):
select id from t with(index(index_name)) where num=@num
8. Try to avoid performing expression operations on columns in the where clause, which will cause the engine to give up using the index and perform a full table scan. For example: select id from t where num/2=100
should be changed to: select id from t where num=100*2
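The difference between the two forms in tip 8 shows up in the query plan. This is a minimal sketch using sqlite3 as a stand-in for MySQL; the index name idx_t_num and the sample data are assumptions, and only the plans are compared, not the result sets.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, num INTEGER)")
conn.executemany("INSERT INTO t (num) VALUES (?)", [(i,) for i in range(1000)])
conn.execute("CREATE INDEX idx_t_num ON t (num)")  # hypothetical index name

def plan(sql):
    """Return the detail column of SQLite's EXPLAIN QUERY PLAN for sql."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# Expression applied to the column: every row must be evaluated.
wrapped = plan("SELECT id FROM t WHERE num / 2 = 100")

# Column left bare, arithmetic moved to the constant side: index lookup.
bare = plan("SELECT id FROM t WHERE num = 100 * 2")

print("num / 2 = 100 :", wrapped)
print("num = 100 * 2 :", bare)
```

The same reasoning applies to functions wrapped around a column: the index stores the raw column values, so the comparison has to be made against the raw column for the index to be searched.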
9. Try to avoid applying functions to columns in the where clause, which will cause the engine to give up using the index and perform a full table scan. For example, to find the ids whose name starts with abc, select id from t where substring(name,1,3)='abc' should be changed to:
select id from t where name like 'abc%'
10. Do not apply functions, arithmetic operations, or other expressions to the column on the left side of "=" in the where clause, otherwise the system may not be able to use the index correctly.
11. When an indexed column is used as a condition and the index is a composite index, the first column of the index must appear in the condition for the system to use the index; otherwise the index will not be used. Keep the column order in the condition consistent with the index order as far as possible.
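The leading-column rule in tip 11 can be sketched as follows, again with sqlite3 standing in for MySQL. The composite index idx_t_num_name and the sample rows are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, num INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO t (num, name) VALUES (?, ?)",
    [(i % 50, f"name{i % 7}") for i in range(500)],  # made-up sample rows
)
# Composite index: num is the leading column, name the second one.
conn.execute("CREATE INDEX idx_t_num_name ON t (num, name)")

def plan(sql):
    """Return the detail column of SQLite's EXPLAIN QUERY PLAN for sql."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# The condition includes the leading column: the index can be searched.
leading = plan("SELECT id FROM t WHERE num = 10 AND name = 'name3'")

# The condition skips the leading column: no index search is possible here.
trailing = plan("SELECT id FROM t WHERE name = 'name3'")

print("with leading column:   ", leading)
print("without leading column:", trailing)
```

Some engines can still read such an index in special cases, so treat the rule as a default and confirm the plan with EXPLAIN.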
12. Do not write meaningless queries. For example, to generate an empty table structure (the #t temporary table name is SQL Server syntax):
select col1,col2 into #t from t where 1=0
This kind of code returns no result set but still consumes system resources. Create the table directly instead:
create table #t(…)
13. Using EXISTS instead of IN is often a good choice. Instead of:
select num from a where num in(select num from b)
use the following statement:
select num from a where exists(select 1 from b where num=a.num)
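The two forms in tip 13 should return the same rows, which is worth confirming before swapping one for the other. The sketch below does so with sqlite3 as a stand-in for MySQL; the sample values are made up, and the num column inside the subquery is qualified as b.num for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (num INTEGER)")
conn.execute("CREATE TABLE b (num INTEGER)")
conn.executemany("INSERT INTO a (num) VALUES (?)", [(n,) for n in (1, 2, 3, 4, 5)])
# b contains a duplicate (4) and a value missing from a (6).
conn.executemany("INSERT INTO b (num) VALUES (?)", [(n,) for n in (2, 4, 4, 6)])

in_rows = sorted(
    row[0]
    for row in conn.execute("SELECT num FROM a WHERE num IN (SELECT num FROM b)")
)
exists_rows = sorted(
    row[0]
    for row in conn.execute(
        "SELECT num FROM a WHERE EXISTS (SELECT 1 FROM b WHERE b.num = a.num)"
    )
)

print(in_rows, exists_rows)
```

Neither form multiplies rows when b contains duplicates, which is one reason both are preferable to a plain join for this kind of membership test. Which one runs faster depends on the engine and the data, so compare the plans.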
14. Not every index helps a query. The optimizer chooses a plan based on the data in the table, and when an indexed column contains a large amount of duplicate data the query may not use the index. For example, if a table has a column sex whose values are roughly half male and half female, an index on sex does almost nothing for query efficiency.
15. More indexes are not always better. Although an index can improve the efficiency of the corresponding select, it also reduces the efficiency of insert and update, because the index has to be maintained on every insert or update. How to index therefore requires careful consideration and depends on the circumstances. As a rule of thumb, it is best not to have more than 6 indexes on a table. If there are more, consider whether indexes on rarely used columns are really necessary.
16. Avoid updating clustered index columns (in InnoDB, the primary key) as far as possible, because the order of the clustered index columns is the physical storage order of the table's records. Once a value in such a column changes, the affected records have to be moved to keep that order, which consumes considerable resources. If the application needs to update these columns frequently, reconsider whether they should form the clustered index.
17. Try to use numeric fields. If a field contains only numerical information, do not design it as a character field. Character fields reduce the performance of queries and joins and increase storage overhead, because the engine compares a string character by character when processing queries and joins, while a numeric type needs only one comparison.
18. Use varchar/nvarchar instead of char/nchar as much as possible. First, variable-length fields take less storage space. Second, for queries, searching within a smaller field is obviously more efficient.
19. Do not use select * from t anywhere; replace "*" with a specific column list, and do not return any columns that are not used.
20. Try to use table variables instead of temporary tables (table variables are a SQL Server feature; MySQL does not have them). If a table variable contains a large amount of data, be aware that its indexes are very limited (only the primary key index).
21. Avoid frequently creating and dropping temporary tables, to reduce the consumption of system table resources.
22. Temporary tables are not unusable, and using them appropriately can make certain routines more efficient, for example when you need to repeatedly reference a large table or a data set in a commonly used table. For one-off operations, however, a derived table is the better choice.
23. When creating a temporary table, if a large amount of data is inserted at one time, use select into (in MySQL, create table ... select) instead of create table followed by inserts, to avoid generating a large amount of log and to improve speed. If the amount of data is small, create the table first and then insert, to ease the load on the system tables.
24. If temporary tables are used, explicitly delete all of them at the end of the stored procedure: first truncate table, then drop table. This avoids holding locks on system tables for a long time.
25. Try to avoid cursors, because cursors are inefficient. If a cursor operates on more than 10,000 rows, consider rewriting it.
26. Before using a cursor-based or temporary-table-based method, first look for a set-based solution to the problem. The set-based method is usually more efficient.
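The contrast between row-by-row processing and a set-based statement can be sketched like this. A Python loop plays the role of the cursor, sqlite3 stands in for MySQL, and the function names and sample data are assumptions made for the illustration.

```python
import sqlite3

def make_db():
    """Create an in-memory table t with 1000 made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, num INTEGER)")
    conn.executemany("INSERT INTO t (num) VALUES (?)", [(i,) for i in range(1000)])
    return conn

def row_by_row(conn):
    """Cursor style: fetch the matching rows, then issue one UPDATE per row."""
    rows = conn.execute("SELECT id, num FROM t WHERE num % 2 = 0").fetchall()
    for row_id, num in rows:
        conn.execute("UPDATE t SET num = ? WHERE id = ?", (num + 1, row_id))

def set_based(conn):
    """Set-based style: one statement describes the whole change."""
    conn.execute("UPDATE t SET num = num + 1 WHERE num % 2 = 0")

def snapshot(conn):
    """Return all rows ordered by id, for comparing the two approaches."""
    return conn.execute("SELECT id, num FROM t ORDER BY id").fetchall()

cursor_db, set_db = make_db(), make_db()
row_by_row(cursor_db)
set_based(set_db)
print(snapshot(cursor_db) == snapshot(set_db))
```

Both versions leave the table in the same state, but the set-based one sends a single statement to the engine instead of one per row, which is where the efficiency gain usually comes from.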
27. Like temporary tables, cursors are not unusable. Using FAST_FORWARD cursors (a SQL Server cursor option) on small data sets is often better than other row-by-row processing methods, especially when several tables must be referenced to obtain the required data. Routines that compute "totals" as part of a result set are usually faster than doing the same with a cursor. If development time permits, try both the cursor-based and the set-based method and see which one works better.
28. Set SET NOCOUNT ON at the beginning of all stored procedures and triggers, and SET NOCOUNT OFF at the end, so that no DONE_IN_PROC message has to be sent to the client after each statement of the stored procedure or trigger is executed. SET NOCOUNT is a SQL Server statement and is not available in MySQL.
29. Try to avoid large transactions, in order to improve system concurrency.
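One common way to avoid a single large transaction is to split the work into small batches and commit after each one. The sketch below is a minimal illustration with sqlite3 standing in for MySQL; the function name delete_in_batches, the batch size, and the num < 500 condition are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, num INTEGER)")
conn.executemany("INSERT INTO t (num) VALUES (?)", [(i,) for i in range(1000)])
conn.commit()

def delete_in_batches(conn, batch_size=100):
    """Delete rows with num < 500 in small transactions; return the batch count."""
    batches = 0
    while True:
        # SQLite needs a subquery to limit a DELETE. In MySQL the same idea
        # is usually written as: DELETE FROM t WHERE num < 500 LIMIT 100
        cur = conn.execute(
            "DELETE FROM t WHERE id IN "
            "(SELECT id FROM t WHERE num < 500 LIMIT ?)",
            (batch_size,),
        )
        conn.commit()  # end the transaction so locks are released early
        if cur.rowcount == 0:
            break
        batches += 1
    return batches

batches = delete_in_batches(conn)
remaining = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
print(batches, remaining)
```

Each commit ends a short transaction, so other sessions wait only for the duration of one batch rather than for the whole operation. The trade-off is that the operation as a whole is no longer atomic.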
30. Try to avoid returning large amounts of data to the client. If the amount of data is too large, you should consider whether the corresponding requirements are reasonable.
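When a large result really is needed, one common way to avoid sending it to the client all at once is to fetch it in pages keyed on the primary key. This is a minimal sketch with sqlite3 standing in for MySQL; the function names fetch_page and iter_pages and the page size are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, num INTEGER)")
conn.executemany("INSERT INTO t (num) VALUES (?)", [(i,) for i in range(250)])

def fetch_page(conn, after_id=0, page_size=100):
    """Return up to page_size rows whose id is greater than after_id."""
    return conn.execute(
        "SELECT id, num FROM t WHERE id > ? ORDER BY id LIMIT ?",
        (after_id, page_size),
    ).fetchall()

def iter_pages(conn, page_size=100):
    """Yield the table page by page, resuming after the last id seen."""
    last_id = 0
    while True:
        page = fetch_page(conn, last_id, page_size)
        if not page:
            return
        yield page
        last_id = page[-1][0]

page_sizes = [len(page) for page in iter_pages(conn, page_size=100)]
print(page_sizes)
```

Filtering on id > last_id lets each page start from an index lookup, whereas a large offset makes the engine read and discard all the rows before it.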
