Highly liked sharing: MySQL optimization ideas that are in line with production-Mysql Tutorial-php.cn

Home

Database

Mysql Tutorial

Highly liked sharing: MySQL optimization ideas that are in line with production

藏色散人

Nov 05, 2021 pm 02:16 PM

mysql

Preface

The starting point of writing this article is to record my accumulated experience in dealing with data problems at work. As I write and write, I find that every point will lead to Other background knowledge should be provided, such as when optimizing indexes, you need to have a certain understanding of slow query, Explain and other related functions. For example, introducing Elasticsearch requires solving data synchronization, learning Elasticsearch knowledge, etc. Due to the length of the article, it is impossible to cover every point. They are all detailed like video tutorials, and I can only summarize them based on my limited knowledge and some general points. Even so, the length of the article is already very long. If you are interested in a certain point, please go to Baidu/Google for in-depth knowledge of individual details.

The article is quite long, so if you are interested, you may want to read it through. I hope you haven’t wasted dozens of minutes. [Recommended learning: "mysql video tutorial"]

Thinking Perspective

Database technology has gone through the manual management stage and the file system stage so far. and database system stage.

In the early days when there was no software system, real-world operations of a certain business could also be realized through the manual management stage of manual accounting and verbal agreements. This form existed for a long time and was relatively inefficient. a plan. In the next stage, with the development of computer technology, there was a file system stage that replaced manual accounting with excel tables, which improved productivity to a certain extent. In the software system stage, which is a database system with simple operation and high efficiency, productivity has been improved again, specific problems in the real world are abstracted into data, and real-world business is represented through the flow and change of data. In software systems, data storage is generally composed of a relational database and multiple non-relational databases.

The database is strongly related to the system business. This requires the product manager to understand the data storage and query process when designing the business. At the beginning of the design, it is clear what impact the business changes will have on the database and whether A new technology stack needs to be referenced. For example, a business designed by the product manager is to conduct statistical analysis and summary of data on multiple MySQL tables with a single table volume of millions. If MySQL multi-table query is directly used, slow queries will definitely occur and cause the msyql service to go down. In this case, the solution is Either compromise on the product side or change the technology stack.

In the system architecture and database solution, we should choose the one that is more suitable for the company's team capabilities. In the early stage of the system, simple database optimization with banknote capabilities will be the most cost-effective solution, but when it comes to mysql database banknote capabilities, there is nothing we can do. , introducing software services that focus on key functions will become the most cost-effective solution. How to choose the appropriate solution when encountering problems is the time to reflect your value.

A poor boy falls in love with a rich girl. The short-term sweetness cannot match the real class inequality. The happy ending only exists in the fantasy of the poor boy and the TV series of Teacher Qiong Yao.

How to improve the performance of data storage at a limited cost is the central idea of this article.

Background knowledge

I believe that everyone will often come into contact with the following content in their daily work. Let me briefly summarize it.

Relational database

Relational database is a data organization composed of two-dimensional tables and the relationships between them, providing transaction data consistency, Functions such as data persistence are the core storage services of software systems. They are the databases we most often come into contact with during development and interviews. For some small outsourcing projects, one MySQL is enough to meet all business needs. It is something that we often come into contact with, and it is actually full of tricks. We will discuss the tricks in detail in the following chapters.
Advantages:

Transaction
Persistence
Relatively common SQL language

Problems

The requirements for hard disk I/O are very high
The aggregation query efficiency of large amounts of data is low
Index misses
The leftmost matching principle of the index leads to mismatches Suitable for full-text retrieval
Improper use of transactions will cause lock congestion
Various problems caused by horizontal expansion are difficult to deal with

Non-relational Type database - NoSql

MySQL database, as a relational data storage software, has advantages and obvious disadvantages. Therefore, when the data volume of the software system continues to expand and the business complexity continues to increase, We cannot expect to solve all problems by enhancing the capabilities of the MySQL database. Instead, we need to introduce other storage software and use various types of NoSQL to solve the problems of the software system's expanding data volume and increasing business complexity.
Relational database is an optimization of relational database in different scenarios. It does not mean that everything will be fine if you introduce some kind of NoSQL. It means that you should fully understand the types and application difficulties of NoSQL on the market and choose the appropriate storage in the appropriate scenario. Software is the way to go.

Key-Value type

In business, the contents of certain tables are often queried, but most of the query results remain unchanged, so Key-value storage software, mainly Memcached and Redis, has emerged and is widely used in cache modules in the system. Redis has more data structures and persistence than Memcached, making it the most widely used among KV-type NoSQL.

Search type

In the scenario of full-text search, query optimization of MySQLB tree index, like query cannot hit the index, and every like keyword query is one time Full table scan can be supported in tables with tens of thousands of data, but slow queries will occur when the data is at the end. If the business code is not well written and the Like query is called in the transaction, a read lock will occur. ElasticSearch, with inverted index as its core, can perfectly meet the scenario of full-text search. At the same time, ElasticSearch also supports massive data very well, and the documentation and ecology are also very good. ElasticSearch is a representative product of search type.

Document type

Document type NoSql refers to a type of NoSql that stores semi-structured data as documents. Document type NoSql usually stores data in JSON or XML format. , so document-type NoSql does not have Schema. Since there is no Schema feature, we can store and read data at will. Therefore, the emergence of document-type NoSql solves the problem of inconvenient expansion of relational database table structures. The author has never used

column formula

For enterprises of a certain size, the business often involves some real-time and flexible data summary, which is not suitable for this kind of business Use the solution of calculating in advance to solve the problem. Even if you can write the business using the solution of calculating and summarizing in advance, as the number of summarized data increases, the final step of accumulating the summarized data will gradually become very slow. Column-based NoSQL is the product of this scenario. It is one of the most representative technologies in the big data era. The most common one is HBase, but the application of HBase is very heavy and often requires a complete set of Hadoop ecosystem to run. The author's company Alibaba Cloud's AnalyticDB is used, a column storage software compatible with MySql query statements. The powerful query capabilities of summary column storage software are sufficient to support various real-time and flexible data summary services.

Case

Taking 2021 as the time node, most systems start with the following plan in the early stage. Next, I will use this case Make some adjustments slowly.

Highly liked sharing: MySQL optimization ideas that are in line with production

#The benefits brought by hardware upgrades are lower as time goes by. This is the fastest optimization solution when time and personnel are tight. The benefits brought by software optimization are higher in the future, but the level of technical personnel required is also higher in the future. When time and personnel permit, it is the most cost-effective optimization solution. Hardware and software optimization are not mutually exclusive. When needed, both can approach the upper limit of MYSQL performance at the same time.

Highly liked sharing: MySQL optimization ideas that are in line with production

Hard Optimization - Cash Ability

Phase One
- Improvement For disk I/O, try to use SSD disks (quality improvement)
- Increase memory and increase query cache space
- Increase the number of CPU cores and increase execution threads
Phase 2
- Replace the self-built mysql with the service provider mysql service
- Enable the built-in read and write separation function
Phase 3
- The service provider’s mysql service is replaced with a cloud-native distributed database
- Enable the built-in read and write separation function
- Enable the built-in sub-table function

Soft Optimization - Query - OLTP

OLTP is mainly used to record the occurrence of certain types of business events, such as user behavior. When the behavior occurs, the system will record When and where the user did something, such a row (or multiple rows) of data will be updated in the database in the form of additions, deletions, and modifications. This requires high real-time performance, strong stability, and ensuring that the data is updated successfully in a timely manner. Common business systems all belong to OLTP, and the databases used are transaction databases, such as MySlq, Oracle, etc. For OLTP, improving query speed and service stability are the core of optimization

Highly liked sharing: MySQL optimization ideas that are in line with production

Slow query
- Discover the SQL with efficiency problems through the slow query log
Problem SQL troubleshooting direction
- There is a problem with the index design
- There is a problem with the SQL statement
- Wrong index selection for the database
- The single table is large
Explain specific analysis
- View sql execution comparison rate
- Check the index hit situation (key points)
mysql optimizer
- When the optimizer selects an index, it will refer to the index Cardinality
- The cardinality is automatically maintained and estimated by MySQL, and may not be accurate
- If the index does not hit or the wrong index is used, there is a problem with the optimizer step
- analyze can re-count index information and recalculate the base number
Force index
- The force keyword can force the use of index and forcefully specify index on the business code
Covered index - the most ideal hit index
- Covered index means that the same index (unique, ordinary, joint index, etc.) is used from the execution of the query statement to the returned result
- Covering indexes can reduce back table queries
- If the data query uses more than one index, it is not a covering index
- You can optimize the SQL statement or optimize the joint index. Use covering index
count() function
- count(non-indexed field) - covering index cannot be used, theoretically the slowest
- count(index field) - can overwrite the index, but still needs to determine whether the field is null each time
- count (primary key) - Same as above
- count(1) - only scans the index tree, there is no process of parsing the data rows , theoretically faster, but it will still determine whether 1 is null
- count(*) - MySQL specifically optimizes the count(*) function to directly return the number of data in the index tree, optimal
ORDER BY
- Minimize additional sorting and specify where condition
- The combination of where statement and ORDER BY statement satisfies the leftmost prefix
- The most efficient - Index coverage (few scenarios, low chance of encounter)
  - Index coverage can skip generating the intermediate result set and directly output the query results
  - The ORDER field needs to be indexed and consistent with the WHERE conditions and output The contents are all in the same index
Paging query
- First find a way to cover the index
- Find out what you need first The id of the data, return to the table to get the final result set
Index pushdown
- KEY store_id_guide_id (store_id,guide_id) USING BTREE
- select * from table where store_id in (1,2) and guide_id = 3;
- Before MySQL5.6, you need to use the index to query store_id in (1,2), then add all tables to verify film_id = 3
- MySQL5.6, if it can be read in the index, directly use index filtering
Loose index scan
- KEY store_id_guide_id (store_id,guide_id) USING BTREE
- select film_id from table where guide_id = 3
- New features of MySQL8.0
- Loose index scan can break the "left-hand principle" and solve the problem of losing the leading brother
- The efficiency is lower than that of the joint index
Function operation
- If you perform function operation on the index field, the optimizer will give up the index
- This situation may include: time function, converting string to number, character encoding conversion
- Optimize the use of server-side logic to replace mysql functions
The single table size is too large
- Upgrade mysql, the single table size that different mysql software can carry It is different. Based on my current experience, there is no problem in querying the hit index when the single table of Alibaba Cloud polardb cluster version is 200 million (high priority)
- Data settlement - such as pipeline data can be The latest value is obtained through settlement at a certain point in time, and the settled flow is transferred to the backup table (medium priority)
- Separation of hot and cold data - data that cannot be settled is distinguished according to the frequency of query, and the frequency is low Transfer the query to another table, and distinguish the entry point of the query (medium priority) in terms of business
- Distributed database table splitting-enable the table splitting function of the distributed database with orders, and the distributed database component management is Insertion and query after table splitting (medium priority)
- Code to implement table splitting - split a single table into multiple tables according to certain rules, after splitting in most framework ORMs of PHP and GO It is necessary to make certain modifications to the framework ORM. The ORM in JAVA has native support. It is recommended to consider it at the early stage of the project. The later it becomes more difficult (lower priority)

Soft Optimization - Write Update Delete

Highly liked sharing: MySQL optimization ideas that are in line with production

Lock
- According to granularity MySQL locks can be divided into global locks, table-level locks, and row locks
- Global lock
  - 自google/baidu
- Table-level locks are divided into table locks (data locks) and metadata locks
  - Table locks
    - 自google/baidu
  - Metadata lock
    - 自google/baidu
- ##Row lock will lock the data row, divided into Shared locks and exclusive locks
Solution to deadlock
- Parameter configuration
  - Adjust the innodb_lock_wait_timeout parameter
    - The default is 50 seconds, that is, if the lock is not acquired after waiting for 50 seconds, the current statement will report an error
    - If the waiting time expires Long, you can shorten this parameter appropriately
  - Active deadlock detection: innodb_deadlock_detect
    - Roll back less costly transactions when a deadlock is found
    - Enabled by default
- Do not open the transaction if it is not necessary
- Try to place the query outside the transaction to reduce the number of locked rows
- Avoid The transaction time is too long, do not trigger http requests in the transaction
- Actively check the transaction status
```
show  processlist;SELECT * FROM information_schema.INNODB_TRX; //长事务SELECT * FROM information_schema.INNODB_LOCKs; //查看锁SELECT * FROM information_schema.INNODB_LOCK_waits; //查看阻塞事务
```

Search business

The number of search rows is less than 100,000 - mysql is hard to carry
- Improve the cpu, io and memory hardware of mysql
The number of search rows is more than 100,000 - introduction Elasticsearch

Highly liked sharing: MySQL optimization ideas that are in line with production

Elasticsearch’s inverted index is suitable for full-text search, but the data structure has poor flexibility.

Data synchronization
- When the business code changes data, it is synchronized to Elasticsearch at the same time
- Canel subscription mysql log triggers synchronization
Elasticsearch-index
- is composed of a list of documents with the same fields - analogous to mysql's table
- Once the field type is set, modification is prohibited and new fields are allowed
- Specific The method is self-directed google/baidu
Elasticsearch-Document
- The user’s data document stored in es - analogy to the row of mysql
- is composed of metadata and Json Object composition
- Metadata and Json Object details are provided by google/baidu
Elasticsearch-Word Segmenter
- by Google/baidu
Elasticsearch-Inverted Index (Key Points)
- 自google/baidu

Statistical Business-OLAP

OLAP is used for decision-making analysis of data relative to OLTP transaction processing scenarios. It is an offline data warehouse idea used in big data analysis, not a specific technology stack. If your solution can embody the idea of OLAP analysis and processing, then the solution is OLAP.

Early data warehouse construction mainly refers to modeling and summarizing enterprise business databases such as ERP, CRM, SCM and other data into the data warehouse engine according to the requirements of decision-making analysis. Its application is mainly for reporting purposes. It is to support the decision-making of management and business personnel (medium- and long-term strategic decisions). As IT technology moves toward the Internet and mobility, data sources become more and more abundant. Unstructured data appears on the basis of the original business database, such as website logs, IoT device data, APP buried data, etc. The amount of these data It is several orders of magnitude larger than previous structured data.

No matter how the business faced by OLAP changes, it is inseparable from the following steps: Determine the analysis field->Synchronize business data to the computing library->Data cleaning modeling->Synchronize to the data warehouse- >Exposed to the outside

The calculation source database is specially used for data cleaning, and the purpose is to avoid affecting the performance of the business database during data cleaning. By cleaning the data in the calculation source database according to business and dimensions, the usability and reusability of the data are increased, and the final real-time detailed data is obtained, which is transferred to the data warehouse, and the data warehouse provides the final decision analysis data.

DEMO plan

Highly liked sharing: MySQL optimization ideas that are in line with production

Production plan

Highly liked sharing: MySQL optimization ideas that are in line with production

The software in each link can use the same functions If the software is replaced by the team's most confident software implementation solution, then the solution is OLAP.

Summary

Optimization must be down-to-earth, with capability accumulation step by step, multiple rounds of iterations, and cannot be achieved overnight. Conduct multiple rounds of iterations based on your own foundation, business scenarios and future development expectations.

The principle of iteration is to first improve the efficiency of the software through soft optimization and hard optimization of a single software service. When the optimization cost is lower than the benefit, based on future development expectations, refer to mature solutions on the market and follow the solutions. Introduce new software as needed for combined innovation, and avoid blind copying. Only through organic integration can the effects of 1 1>2, 2 1>3 be achieved. When the referenced software encounters a bottleneck, repeat this process.

Thank you for reading this. The above is all the content of the article. The optimization points and solutions proposed in the content are not necessarily the optimal solutions. They are the best practices in personal work. If you have different opinions, you are welcome to discuss them. comminicate.

The above is the detailed content of Highly liked sharing: MySQL optimization ideas that are in line with production. For more information, please follow other related articles on the PHP Chinese website!

Statement

This article is reproduced at:learnku. If there is any infringement, please contact admin@php.cn delete

Explain the InnoDB Buffer Pool and its importance for performance.Apr 19, 2025 am 12:24 AM

InnoDBBufferPool reduces disk I/O by caching data and indexing pages, improving database performance. Its working principle includes: 1. Data reading: Read data from BufferPool; 2. Data writing: After modifying the data, write to BufferPool and refresh it to disk regularly; 3. Cache management: Use the LRU algorithm to manage cache pages; 4. Reading mechanism: Load adjacent data pages in advance. By sizing the BufferPool and using multiple instances, database performance can be optimized.

MySQL vs. Other Programming Languages: A ComparisonApr 19, 2025 am 12:22 AM

Compared with other programming languages, MySQL is mainly used to store and manage data, while other languages such as Python, Java, and C are used for logical processing and application development. MySQL is known for its high performance, scalability and cross-platform support, suitable for data management needs, while other languages have advantages in their respective fields such as data analytics, enterprise applications, and system programming.

Learning MySQL: A Step-by-Step Guide for New UsersApr 19, 2025 am 12:19 AM

MySQL is worth learning because it is a powerful open source database management system suitable for data storage, management and analysis. 1) MySQL is a relational database that uses SQL to operate data and is suitable for structured data management. 2) The SQL language is the key to interacting with MySQL and supports CRUD operations. 3) The working principle of MySQL includes client/server architecture, storage engine and query optimizer. 4) Basic usage includes creating databases and tables, and advanced usage involves joining tables using JOIN. 5) Common errors include syntax errors and permission issues, and debugging skills include checking syntax and using EXPLAIN commands. 6) Performance optimization involves the use of indexes, optimization of SQL statements and regular maintenance of databases.

MySQL: Essential Skills for Beginners to MasterApr 18, 2025 am 12:24 AM

MySQL is suitable for beginners to learn database skills. 1. Install MySQL server and client tools. 2. Understand basic SQL queries, such as SELECT. 3. Master data operations: create tables, insert, update, and delete data. 4. Learn advanced skills: subquery and window functions. 5. Debugging and optimization: Check syntax, use indexes, avoid SELECT*, and use LIMIT.

MySQL: Structured Data and Relational DatabasesApr 18, 2025 am 12:22 AM

MySQL efficiently manages structured data through table structure and SQL query, and implements inter-table relationships through foreign keys. 1. Define the data format and type when creating a table. 2. Use foreign keys to establish relationships between tables. 3. Improve performance through indexing and query optimization. 4. Regularly backup and monitor databases to ensure data security and performance optimization.

MySQL: Key Features and Capabilities ExplainedApr 18, 2025 am 12:17 AM

MySQL is an open source relational database management system that is widely used in Web development. Its key features include: 1. Supports multiple storage engines, such as InnoDB and MyISAM, suitable for different scenarios; 2. Provides master-slave replication functions to facilitate load balancing and data backup; 3. Improve query efficiency through query optimization and index use.

The Purpose of SQL: Interacting with MySQL DatabasesApr 18, 2025 am 12:12 AM

SQL is used to interact with MySQL database to realize data addition, deletion, modification, inspection and database design. 1) SQL performs data operations through SELECT, INSERT, UPDATE, DELETE statements; 2) Use CREATE, ALTER, DROP statements for database design and management; 3) Complex queries and data analysis are implemented through SQL to improve business decision-making efficiency.

MySQL for Beginners: Getting Started with Database ManagementApr 18, 2025 am 12:10 AM

The basic operations of MySQL include creating databases, tables, and using SQL to perform CRUD operations on data. 1. Create a database: CREATEDATABASEmy_first_db; 2. Create a table: CREATETABLEbooks(idINTAUTO_INCREMENTPRIMARYKEY, titleVARCHAR(100)NOTNULL, authorVARCHAR(100)NOTNULL, published_yearINT); 3. Insert data: INSERTINTObooks(title, author, published_year)VA

See all articles

Hot AI Tools

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress images for free

Clothoff.io

AI clothes remover

Video Face Swap

Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Assassin's Creed Shadows: Seashell Riddle Solution

3 weeks agoByDDD

What's New in Windows 11 KB5054979 & How to Fix Update Issues

2 weeks agoByDDD

Where to find the Crane Control Keycard in Atomfall

3 weeks agoByDDD

Assassin's Creed Shadows - How To Find The Blacksmith And Unlock Weapon And Armour Customisation

1 months agoByDDD

Roblox: Dead Rails - How To Complete Every Challenge

3 weeks agoByDDD

Hot Tools

SecLists

SecLists is the ultimate security tester's companion. It is a collection of various types of lists that are frequently used during security assessments, all in one place. SecLists helps make security testing more efficient and productive by conveniently providing all the lists a security tester might need. List types include usernames, passwords, URLs, fuzzing payloads, sensitive data patterns, web shells, and more. The tester can simply pull this repository onto a new test machine and he will have access to every type of list he needs.

DVWA

Damn Vulnerable Web App (DVWA) is a PHP/MySQL web application that is very vulnerable. Its main goals are to be an aid for security professionals to test their skills and tools in a legal environment, to help web developers better understand the process of securing web applications, and to help teachers/students teach/learn in a classroom environment Web application security. The goal of DVWA is to practice some of the most common web vulnerabilities through a simple and straightforward interface, with varying degrees of difficulty. Please note that this software

SAP NetWeaver Server Adapter for Eclipse

Integrate Eclipse with SAP NetWeaver application server.

MinGW - Minimalist GNU for Windows

This project is in the process of being migrated to osdn.net/projects/mingw, you can continue to follow us there. MinGW: A native Windows port of the GNU Compiler Collection (GCC), freely distributable import libraries and header files for building native Windows applications; includes extensions to the MSVC runtime to support C99 functionality. All MinGW software can run on 64-bit Windows platforms.

Safe Exam Browser

Safe Exam Browser is a secure browser environment for taking online exams securely. This software turns any computer into a secure workstation. It controls access to any utility and prevents students from using unauthorized resources.

Hot Topics

Where is the login entrance for gmail email?

7627

CakePHP Tutorial

1389

What is the format of the account name of steam

win11 activation key permanent

nyt connections hints and answers

140