I started this article to record the experience I have accumulated dealing with data problems at work. As I wrote, I found that every point calls for further background knowledge: optimizing indexes requires some understanding of the slow query log, EXPLAIN, and related features, and introducing Elasticsearch requires solving data synchronization and learning Elasticsearch itself. An article cannot cover every point in the detail of a video tutorial, so I can only summarize some general points based on my limited knowledge. Even so, the article is already very long. If a particular point interests you, please search Baidu/Google for the details.
The article is quite long; if you are interested, please read it through. I hope the dozens of minutes you spend on it are not wasted.
So far, database technology has gone through three stages: the manual management stage, the file system stage, and the database system stage.
In the early days, before software systems existed, real-world business operations were carried out through manual bookkeeping and verbal agreements. This manual management stage lasted a long time and was a relatively inefficient approach. In the next stage, as computer technology developed, the file system stage replaced manual bookkeeping with Excel spreadsheets, which improved productivity to a certain extent. In the software system stage, database systems that are simple to operate and highly efficient improved productivity again: specific real-world problems are abstracted into data, and real-world business is represented through the flow and change of that data. In a software system, data storage generally consists of one relational database and several non-relational databases.
The database is tightly coupled to the system's business. This requires the product manager to understand how data is stored and queried when designing a feature, and to be clear from the start about what impact the business change will have on the database and whether a new technology stack needs to be introduced. For example, suppose the product manager designs a feature that runs statistical analysis and summaries across several MySQL tables, each holding millions of rows. If a MySQL multi-table query is used directly, slow queries will almost certainly occur and can bring the MySQL service down. In this case, the solution is either to compromise on the product side or to change the technology stack.
For the system architecture and the database solution, we should choose whatever best suits the capabilities of the company's team. In the early stage of a system, simple database optimization combined with spending money on hardware is the most cost-effective solution. Once MySQL reaches the point where even money cannot help, introducing software services that focus on the key function becomes the most cost-effective solution. Choosing the appropriate solution when a problem arises is where you show your value.
A poor boy falls in love with a rich girl: the short-lived sweetness is no match for real class inequality. The happy ending exists only in the poor boy's fantasies and in Qiong Yao's TV dramas.
How to improve the performance of data storage at a limited cost is the central idea of this article.
I believe that everyone will often come into contact with the following content in their daily work. Let me briefly summarize it.
A relational database organizes data as two-dimensional tables and the relationships between them, and provides features such as transactions, data consistency, and data persistence. It is the core storage service of a software system and the kind of database we most often meet in development and in interviews. For some small outsourcing projects, a single MySQL instance is enough to meet all business needs. Although it is so familiar, it has many subtleties, which the following chapters discuss in detail.
Advantages:
Problems:
As relational storage software, MySQL has clear advantages and equally clear disadvantages. When the data volume of a software system keeps growing and its business complexity keeps increasing, we cannot expect to solve every problem by strengthening MySQL alone. Instead, we need to introduce other storage software and use the various types of NoSQL to handle the growing data volume and business complexity.
NoSQL is an optimization of the relational database for different scenarios. Introducing some kind of NoSQL does not mean that everything will be fine. The right approach is to understand the types of NoSQL on the market and the difficulties of applying them, and to choose the appropriate storage software for the appropriate scenario.
In business systems, the contents of certain tables are queried frequently while most of the results rarely change. Key-value stores, mainly Memcached and Redis, emerged for this case and are widely used in the cache modules of systems. Compared with Memcached, Redis offers more data structures and persistence, which makes it the most widely used KV-type NoSQL.
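The caching idea described above can be sketched as a cache-aside read path. This is only a minimal illustration under stated assumptions: the in-memory dictionary stands in for Redis or Memcached, `fake_db_lookup` stands in for a MySQL query, and the class and parameter names are hypothetical rather than part of any real client library.

```python
import time


class CacheAside:
    """Cache-aside reads in front of a slow lookup (all names are hypothetical)."""

    def __init__(self, load_from_db, ttl_seconds=60.0):
        self._load = load_from_db   # stand-in for a MySQL query
        self._ttl = ttl_seconds
        self._store = {}            # stand-in for Redis/Memcached: key -> (expires_at, value)
        self.db_hits = 0            # counts how often the database was actually queried

    def get(self, key):
        now = time.monotonic()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]         # cache hit: the database is not touched
        value = self._load(key)     # cache miss or expired entry: query the database
        self.db_hits += 1
        self._store[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        """Drop the cached entry, for example right after the row is updated."""
        self._store.pop(key, None)


def fake_db_lookup(product_id):
    # Hypothetical content of a rarely changing table.
    return {"id": product_id, "name": f"product-{product_id}"}


cache = CacheAside(fake_db_lookup)
first = cache.get(42)    # miss: loaded from the "database"
second = cache.get(42)   # hit: served from the cache
cache.invalidate(42)
third = cache.get(42)    # miss again after invalidation
```

The expiry time and the explicit invalidation are the two usual ways to limit stale reads; which one fits depends on how often the underlying rows change.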
In full-text search scenarios, MySQL's B-tree index cannot help: a LIKE query with a leading wildcard cannot hit the index, so every LIKE keyword query is a full table scan. This is tolerable in tables with tens of thousands of rows, but slow queries appear as the data volume grows. If the business code is poorly written and the LIKE query is called inside a transaction, read locks can also occur. Elasticsearch, with the inverted index at its core, fits the full-text search scenario very well. It also handles massive data well, and its documentation and ecosystem are good. Elasticsearch is the representative search-type product.
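The difference between a LIKE-style scan and an inverted index can be illustrated with a toy sketch. It assumes a tiny hypothetical document set and a naive tokenizer; real Elasticsearch analyzers, scoring, and storage are far more involved, so this only shows the lookup principle.

```python
import re
from collections import defaultdict

# Hypothetical rows: id -> text column.
docs = {
    1: "MySQL index optimization notes",
    2: "Elasticsearch full-text search basics",
    3: "Slow query analysis in MySQL",
}


def like_scan(keyword):
    """Rough equivalent of LIKE '%keyword%': every row is inspected."""
    needle = keyword.lower()
    return sorted(doc_id for doc_id, text in docs.items() if needle in text.lower())


def build_inverted_index(documents):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            index[term].add(doc_id)
    return index


inverted = build_inverted_index(docs)


def term_search(term):
    """A single dictionary lookup instead of a scan over all rows."""
    return sorted(inverted.get(term.lower(), set()))


scan_result = like_scan("mysql")
index_result = term_search("mysql")
```

Note that the two are not interchangeable: the scan matches substrings, while the index matches whole terms, so `like_scan("sql")` finds rows 1 and 3 but `term_search("sql")` finds nothing in this sketch.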
Document-type NoSQL stores semi-structured data as documents, usually in JSON or XML format, and therefore has no schema. Without a schema, data can be stored and read freely, so document-type NoSQL solves the problem that relational table structures are inconvenient to extend. The author has never used this type of database.
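As a rough illustration of what "no schema" means in practice, the sketch below keeps JSON documents with different fields in one collection. The list stands in for a document store, and the field names and the `find` helper are hypothetical, not the API of any real product.

```python
import json

# Two documents in the same collection with different fields; nothing
# like ALTER TABLE is needed when a new attribute appears.
collection = [
    json.loads('{"id": 1, "name": "T-shirt", "size": "M"}'),
    json.loads('{"id": 2, "name": "Laptop", "specs": {"ram_gb": 16}}'),
]


def find(coll, field, value):
    """Return the ids of documents whose top-level field equals value."""
    return [doc["id"] for doc in coll if doc.get(field) == value]


# A later document introduces a brand-new attribute without any migration.
collection.append({"id": 3, "name": "Phone", "size": "M", "color": "black"})

size_m = find(collection, "size", "M")
black_items = find(collection, "color", "black")
```

The price of this freedom is that readers must tolerate missing fields, which is why the lookup uses `doc.get(field)` instead of assuming every document has the attribute.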
For enterprises of a certain size, the business often involves real-time, flexible data summaries, and this kind of business is not suited to a precomputation approach. Even if the feature can be written with precomputed summaries, the final step of accumulating them gradually becomes very slow as the amount of summarized data grows. Column-oriented NoSQL is the product of this scenario and one of the most representative technologies of the big data era. The most common one is HBase, but HBase is heavyweight and often needs a complete Hadoop ecosystem to run. The author's company uses Alibaba Cloud's AnalyticDB, a column store compatible with MySQL query statements. The strong aggregation query capability of column stores is enough to support all kinds of real-time, flexible data summary services.
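Why a column layout suits summaries can be shown with a small in-memory sketch. The table and field names are hypothetical, and the example says nothing about how HBase or AnalyticDB are actually implemented; it only contrasts the two layouts for one aggregate.

```python
# Row-oriented layout: all fields of a record are stored together,
# which is how a transactional row store keeps its data.
rows = [
    {"order_id": 1, "store_id": 10, "amount": 120, "note": "gift wrap"},
    {"order_id": 2, "store_id": 11, "amount": 80, "note": ""},
    {"order_id": 3, "store_id": 10, "amount": 50, "note": "express"},
]

# Column-oriented layout: the values of each column are stored together.
columns = {name: [record[name] for record in rows] for name in rows[0]}


def total_by_store_rows(data):
    """Aggregate by walking whole records, although only two fields are used."""
    totals = {}
    for record in data:
        totals[record["store_id"]] = totals.get(record["store_id"], 0) + record["amount"]
    return totals


def total_by_store_columns(cols):
    """Aggregate by reading only the two columns the summary needs."""
    totals = {}
    for store_id, amount in zip(cols["store_id"], cols["amount"]):
        totals[store_id] = totals.get(store_id, 0) + amount
    return totals


row_totals = total_by_store_rows(rows)
column_totals = total_by_store_columns(columns)
```

Both functions return the same totals. The point is that the column version never touches `order_id` or `note`, which on disk translates into far less data read for wide tables.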
Taking 2021 as the reference point, most systems start with the following plan in the early stage. Next, I will adjust this case step by step.
The benefit of hardware upgrades diminishes over time, but they are the fastest optimization when time and staff are tight. The benefit of software optimization grows over time, but so does the skill level it requires of the technical staff; when time and staff permit, it is the most cost-effective optimization. Hardware and software optimization are not mutually exclusive: when needed, both can be applied together to approach the upper limit of MySQL performance.
Phase 1
Phase 2
Phase 3
OLTP is mainly used to record the occurrence of certain kinds of business events, such as user behavior. When the behavior occurs, the system records when and where the user did what, and one or more rows of data are inserted, deleted, or modified in the database. This requires high real-time performance, strong stability, and assurance that the data is updated successfully and promptly. Common business systems all belong to OLTP, and the databases they use are transactional databases such as MySQL and Oracle. For OLTP, improving query speed and service stability is the core of optimization.
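Since query speed is central to OLTP, it helps to see how a composite index such as `store_id_guide_id` on `(store_id, guide_id)` is matched against queries. The sketch below uses Python's built-in SQLite purely as a stand-in so that it runs without a server; the `orders` table and `amount` column are hypothetical, and MySQL's `EXPLAIN` output has a different format, although the leftmost-prefix behavior shown here is the same idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, store_id INTEGER, guide_id INTEGER, amount INTEGER)"
)
# Composite index with store_id as the leading column.
conn.execute("CREATE INDEX store_id_guide_id ON orders (store_id, guide_id)")


def plan(sql):
    """Return the detail text of SQLite's EXPLAIN QUERY PLAN for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


# Both columns, or the leading column alone, can use the index.
both_columns = plan("SELECT * FROM orders WHERE store_id = 1 AND guide_id = 2")
leading_only = plan("SELECT * FROM orders WHERE store_id = 1")
# Filtering only on the second column cannot seek into the index.
second_only = plan("SELECT * FROM orders WHERE guide_id = 2")

for label, text in [("both", both_columns), ("leading", leading_only), ("second", second_only)]:
    print(label, "->", text)
```

In MySQL, the corresponding check would be to run `EXPLAIN` on each statement and compare the `key` column of the result.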
store_id_guide_id (store_id, guide_id) USING BTREE
Lock
By granularity, MySQL locks can be divided into global locks, table-level locks, and row locks.
Global lock
Table-level locks are divided into table locks (data locks) and metadata locks.
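SHOW PROCESSLIST;
-- long-running transactions
SELECT * FROM information_schema.INNODB_TRX;
-- current locks (removed in MySQL 8.0, where performance_schema.data_locks replaces it)
SELECT * FROM information_schema.INNODB_LOCKS;
-- blocked transactions (removed in MySQL 8.0, where performance_schema.data_lock_waits replaces it)
SELECT * FROM information_schema.INNODB_LOCK_WAITS;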
Elasticsearch’s inverted index is suitable for full-text search, but the data structure has poor flexibility.