
Master MYSQL Advanced

coldplay.xixi
2021-01-25



Article Directory

  • 1 Preface
    • 1.1 Database architecture
    • 1.2 Monitoring information
  • 2 Factors affecting the database
    • 2.1 Ultra-high QPS and TPS
    • 2.2 Massive concurrency and ultra-high CPU usage
    • 2.3 Disk IO
    • 2.4 Network card traffic
    • 2.5 Large tables
      • 2.5.1 The impact of large tables on queries
      • 2.5.2 The impact of large tables on DDL operations
      • 2.5.3 How to deal with large tables in the database
    • 2.6 Large transactions
      • 2.6.1 What is a transaction
      • 2.6.2 Atomicity of transactions (ATOMICITY)
      • 2.6.3 Consistency of transactions (CONSISTENCY)
      • 2.6.4 Isolation of transactions (ISOLATION)
      • 2.6.5 Durability of transactions (DURABILITY)
      • 2.6.6 What is a large transaction

1 Preface

A large part of the pressure on a server comes from the database. Without a stable database and server environment, the server is prone to failures or even downtime, and the consequences can be immeasurable, so database performance optimization is essential.

1.1 Database architecture

A typical database architecture consists of one master server and several (or even a dozen or more) slave servers kept in sync through master-slave replication. When the master crashes, an engineer has to manually pick the slave with the most up-to-date data to take over as the new master and then repoint the remaining slaves to it. Because there are often many slaves, this process can be quite time-consuming, and it also puts considerable strain on network card capacity.

1.2 Monitoring information

QPS & TPS: The higher the value, the better.

Concurrency: The number of requests processed at the same time.
CPU usage: the lower, the better.
Disk IO: The higher the read and write performance, the better.
Note: Companies are generally advised not to back up the main database immediately before or after a major promotion, or to cancel such plans before large-scale events, because a backup at that time can seriously hurt server performance.

2 Factors affecting the database

Many factors affect the database, such as SQL query speed, server hardware, network card traffic, and disk IO; we will go into these in detail later. Here is an introduction to what the monitoring information tells us and how we should optimize for it.

2.1 Ultra-high QPS and TPS

Inefficient SQL often creates the risk of ultra-high QPS and TPS. During a typical promotion period, visits to the website increase sharply, and the QPS and TPS of the database rise with them.

What is QPS: the number of queries processed per second. For example, if one CPU can process a SQL statement in 10 ms, it can handle 100 statements per second, so QPS <= 100. If instead each statement takes 100 ms, it can only handle 10 statements per second, so QPS <= 10. The difference between these two situations is huge, which is why SQL performance should be optimized as much as possible.
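As a rough sketch, the average QPS and TPS of a running instance can be estimated from MySQL's standard status counters (Questions, Com_commit, Com_rollback, Uptime); for a real-time figure you would sample the counters twice and divide the difference by the sampling interval:

  SHOW GLOBAL STATUS
  WHERE Variable_name IN ('Questions', 'Com_commit', 'Com_rollback', 'Uptime');

  -- Average QPS since startup ≈ Questions / Uptime
  -- Average TPS since startup ≈ (Com_commit + Com_rollback) / Uptime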

2.2 Massive concurrency and ultra-high CPU usage

What risks does this cause?

Under massive concurrency, the number of database connections may be exhausted. In that case, try setting the max_connections parameter to a larger value (the default is small: 100 in older versions and 151 since MySQL 5.5). Once the limit is exceeded, MySQL refuses new connections with a "Too many connections" error, and the application starts returning errors such as HTTP 500. Under extremely high CPU usage, the server may even go down because CPU resources are exhausted.
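A minimal sketch of checking and raising the connection limit; the value 500 below is only an illustrative assumption and should be sized to the server's memory and workload:

  -- Current limit and the peak number of connections seen so far
  SHOW VARIABLES LIKE 'max_connections';
  SHOW GLOBAL STATUS LIKE 'Max_used_connections';

  -- Raise the limit at runtime (affects new connections; requires sufficient privileges)
  SET GLOBAL max_connections = 500;

  -- To make the change permanent, also set max_connections under [mysqld] in my.cnf.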

2.3 Disk IO

Disk IO is one of the database's bottlenecks, and it brings the risk of a sudden drop in disk IO performance:

  1. This often occurs when the hot data set grows larger than the server's available memory; usually it can only be solved with faster disk devices.
  2. Other scheduled tasks can also consume a lot of disk performance. As mentioned above, it is best to avoid backing up the master database right before a big promotion, or to run the backup on a slave server, adjust the scheduled tasks, and keep the disks well maintained.

2.4 Network card traffic

Obviously, excessive network card traffic brings the risk of saturating the network card's IO.

A typical network card is a gigabit card (1000 Mb/s ÷ 8 = 125 MB/s, roughly 100 MB/s in practice).
If traffic exceeds what the network card can carry, services will be unable to connect to the database. How can we avoid that situation?

  1. Reduce the number of slave servers
    Because each slave server must copy logs from the master server, the more slave servers, the greater the network card traffic.
  2. Perform hierarchical caching
    Be sure to avoid the sudden increase in back-end visits caused by the sudden failure of the front-end cache.
  3. Avoid using SELECT * for queries
    This is the most basic database optimization: querying unnecessary fields also consumes a lot of network traffic (see the sketch after this list).
  4. Separate the business network from the server network
    This prevents master-slave synchronization and network backups from degrading network performance.
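As a sketch of point 3, assuming a hypothetical orders table: selecting only the columns the page actually needs keeps result sets, and therefore network traffic, small.

  -- Avoid: pulls every column of every matching row across the network
  SELECT * FROM orders WHERE user_id = 42;

  -- Prefer: only the fields the page actually renders
  SELECT order_id, status, total_amount, created_at
  FROM orders
  WHERE user_id = 42;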

2.5 Large tables

What kind of table counts as a large table? It is all relative, and different storage engines have different limits. NoSQL data stores generally do not limit the number of rows in a table; in theory they can keep storing data as long as disk capacity allows. But once a table exceeds tens of millions of rows, it has a significant impact on database performance. We can summarize the characteristics of large tables as follows:

  • The number of rows is huge: a single table exceeds tens of millions of rows
  • The data file is huge: the table's data file exceeds 10 GB

But even a table that meets the above characteristics may not have a big impact on database performance, so the definition is relative. Take a typical log table: even though its row count and file size are large, we generally only insert into and query it, without large numbers of modifications or deletions, so it does not hurt database performance much.
However, if one day a business change requires adding a column to that 10 GB log table, the amount of work involved is undoubtedly huge.
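To see which tables on an instance are approaching "large table" territory, information_schema can be queried; note that table_rows is only an estimate for InnoDB, and the 10 GB figure above is just a rule of thumb:

  SELECT table_schema,
         table_name,
         table_rows,                                                      -- estimated row count
         ROUND((data_length + index_length) / 1024 / 1024 / 1024, 2) AS size_gb
  FROM information_schema.tables
  WHERE table_type = 'BASE TABLE'
  ORDER BY data_length + index_length DESC
  LIMIT 20;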

2.5.1 The impact of large tables on queries

Large tables often go hand in hand with slow queries: queries that cannot filter out the required data within a reasonable amount of time.
For example, suppose a log table with tens of millions of rows has a field called order source, which records the platform on which each order was generated. If the business does not need it at first, it has no impact on database performance. But if the business later needs to know how many of those tens of millions of orders came from each platform, that becomes a big problem.
Because there are only a few source channels, the column's selectivity is very low, so filtering specific rows out of tens of millions consumes a lot of disk IO and seriously reduces disk efficiency. If, on top of that, the order source is queried from the database every time a user views an order, a large number of slow queries will be generated.
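As a hedged illustration, assume the log table is called order_log and the field is order_source (both names are hypothetical). Because the column has only a handful of distinct values, its selectivity is very low, so even an index on it filters out little and the statistics query still touches a huge share of the rows:

  -- Low-selectivity aggregate: scans (or index-scans) most of the tens of millions of rows
  SELECT order_source, COUNT(*) AS order_count
  FROM order_log
  GROUP BY order_source;

  -- Measuring selectivity before deciding whether an index is worthwhile
  SELECT COUNT(DISTINCT order_source) / COUNT(*) AS selectivity FROM order_log;

A common remedy is to maintain per-source counts in a separate summary table or cache instead of filtering the big table on every page view.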

2.5.2 The impact of large tables on DDL operations

What risks do DDL operations on large tables bring us?

  1. Creating an index takes a long time
    Before MySQL 5.5 the database locked the table while creating an index; from 5.5 onward the table is no longer locked. However, because of MySQL's replication mechanism the DDL statement is executed on the master first and only then shipped to the slaves through the log, so it can still cause a long master-slave delay and affect normal business.
  2. Modifying the table structure requires locking the table for a long time
    Locking the table while its structure is being modified exposes us to the risk of a long master-slave delay. Because of MySQL's master-slave replication mechanism, structural changes are performed on the master first and then transmitted to the slaves through the log, where the same operation is replayed to complete the replication of the new table structure.
    Suppose modifying a table's structure takes 480 s on the master server; it will then also take 480 s on each slave. Since MySQL (at least in older versions) applies replication with a single thread, once a large table is being altered on a slave, no other data modifications can be applied until that operation finishes, so the change causes a master-slave delay of at least 480 s.
    At the same time, normal data operations are affected: all inserts on the table are blocked, the number of connections rises sharply until the server's limit fills up, and the application starts returning errors such as HTTP 500.
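For index creation specifically, MySQL 5.6+ InnoDB lets you request online DDL explicitly, so the statement fails fast if it cannot run without locking instead of silently blocking writes. A minimal sketch, reusing the hypothetical order_log table from earlier:

  -- Ask InnoDB to build the index in place without blocking reads and writes;
  -- MySQL raises an error if the requested algorithm or lock level is not supported.
  ALTER TABLE order_log
    ADD INDEX idx_order_source (order_source),
    ALGORITHM = INPLACE, LOCK = NONE;

Note that this does not remove the replication cost described above; for changes that still copy the whole table, external tools such as pt-online-schema-change or gh-ost are commonly used instead.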

2.5.3 How to deal with large tables in the database

  1. Split databases and tables (sharding): divide one large table into multiple smaller tables
    Difficulties:
    1. Choosing the sharding key for partitioning the table
    2. Querying and aggregating data across shards after the split
  2. Archive the historical data of large tables (see the sketch after this list)
    Benefit: reduces the impact on front-end and back-end business
    Difficulties:
    1. Choosing the archiving cut-off point
    2. How to perform the archiving operation
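A minimal archiving sketch, assuming a hypothetical order_log table with an id primary key and a created_at column, plus an order_log_archive table with an identical structure; the cutoff date and batch size are arbitrary:

  -- One small batch per transaction: copy old rows to the archive table, then remove them
  START TRANSACTION;

  INSERT INTO order_log_archive
  SELECT * FROM order_log
  WHERE created_at < '2020-01-01'
  ORDER BY id
  LIMIT 1000;

  DELETE FROM order_log
  WHERE created_at < '2020-01-01'
  ORDER BY id
  LIMIT 1000;

  COMMIT;

Repeating this in small batches during low-traffic windows keeps locks short; tools such as pt-archiver automate the same loop.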

2.6 Large transactions

2.6.1 What is a transaction

  1. Transactions are one of the important features that distinguish database systems from all other file systems.

  2. A transaction is a group of atomic SQL statements, i.e. an independent unit of work.
    Therefore, a transaction must satisfy the following four properties: atomicity, consistency, isolation, and durability.

2.6.2 Atomicity of transactions (ATOMICITY)

Definition: A transaction must be treated as an indivisible minimum unit of work. All operations within the transaction either all commit successfully or all fail; it is impossible for a transaction to perform only part of its operations.
Example:
A wants to transfer 1,000 yuan to B. The 1,000 yuan is withdrawn from A's account, so A's balance in the database is reduced by 1,000, but when it is about to be added to B's balance the server fails. The 1,000 yuan must then be returned to A's account to preserve the atomicity of the transaction: the operations either succeed together or fail together.
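The transfer example maps directly onto an explicit transaction; the account table and its columns below are illustrative only:

  START TRANSACTION;

  UPDATE account SET balance = balance - 1000 WHERE user_name = 'A';
  UPDATE account SET balance = balance + 1000 WHERE user_name = 'B';

  -- If both updates succeed, make them permanent:
  COMMIT;
  -- If anything fails before COMMIT, undo both so A gets the 1,000 yuan back:
  -- ROLLBACK;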

2.6.3 Consistency of transactions (CONSISTENCY)

Definition: Consistency means that a transaction takes the database from one consistent state to another consistent state; the integrity of the data is not compromised either before the transaction begins or after it ends.
Example:
A's 1,000 yuan in the bank is transferred to B: A's balance becomes 0 and B's balance goes from 0 to 1,000, but throughout the process A + B = 1000 (A's balance) + 0 (B's balance) = 0 (A's balance) + 1000 (B's balance). In other words, the total balance of A and B stays the same, 1,000 yuan, from beginning to end.

2.6.4 Isolation of transactions (ISOLATION)

Definition: Isolation requires that the modifications a transaction makes to the database are invisible to other transactions until the transaction is committed.
Example:
At the bank, A withdraws 500 yuan from a balance of 1,000 yuan. Before the withdrawal transaction is committed, another transaction queries account A's balance and still sees 1,000 yuan, because until the withdrawal transaction commits, its changes are invisible to other transactions.

  • Four isolation levels defined in the SQL standard

    • READ UNCOMMITTED

      • Data from uncommitted transactions is visible to other transactions; this is what we call a dirty read, and the data read this way is called dirty data.
    • READ COMMITTED

      • The default isolation level in many databases. Data can only be read by other transactions after it has been committed; in other words, an uncommitted transaction is invisible to the outside world.
    • REPEATABLE READ

      • A level stricter than read committed. Under repeatable read, suppose a transaction queries the rows of a table without committing; another transaction then inserts a row into that table and commits. When the first, still uncommitted, transaction queries the table again, it gets the same result as before and does not see the newly inserted row.
      • Under the read-committed isolation level, however, that newly inserted row would be visible.
      • Statement to view the current database's isolation level:
        show variables like '%iso%'
      • Statement to modify the current session's isolation level:
        set session tx_isolation='read-committed'
    • SERIALIZABLE

      • The highest isolation level. Simply put, every row read is locked, which can cause a large number of lock timeouts and lock contention problems. We therefore rarely use this level in real business; it is only considered when data consistency must be strictly guaranteed and no concurrency is acceptable.
    • Isolation: Read Uncommitted < Read Committed < Repeatable Read < Serializable

    • Concurrency: Read Uncommitted > Read Committed > Repeatable Read > Serializable
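A sketch of the behaviour described above, using two sessions and a hypothetical table t: under REPEATABLE READ the first session keeps seeing its original snapshot, while under READ COMMITTED it would see the newly committed row.

  -- Session 1
  START TRANSACTION;
  SELECT COUNT(*) FROM t;   -- step 1: returns, say, 10

  -- Session 2
  INSERT INTO t (val) VALUES ('new');
  COMMIT;                   -- step 2: the new row is committed

  -- Session 1, still inside its original transaction
  SELECT COUNT(*) FROM t;   -- step 3: still 10 under REPEATABLE READ,
                            --         11 under READ COMMITTED
  COMMIT;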

2.6.5 Durability of transactions (DURABILITY)

Definition: Once a transaction is committed, its modifications are permanently saved in the database. Even if the system crashes at that moment, the committed changes will not be lost (barring physical factors such as disk damage).

Example:
User A deposits 1,000 yuan into a bank account. Once the transaction is committed, even if the banking system crashes, after recovery the 1,000 yuan in account A will not change unless A operates on the balance again. This is the durability of the transaction.
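As a hedged aside, how strictly InnoDB enforces durability on a crash is controlled by the innodb_flush_log_at_trx_commit variable; the default value 1 flushes the redo log to disk at every commit, which is what gives committed transactions their durability:

  SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';
  -- 1 : write and sync the redo log on every commit (full durability, the default)
  -- 2 : write on commit, sync about once per second (an OS crash may lose ~1 s of transactions)
  -- 0 : write and sync about once per second (a mysqld crash may lose ~1 s of transactions)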

2.6.6 What is a large transaction

Having said all that, what exactly is a large transaction?

A large transaction is one that runs for a long time and operates on a large amount of data. For example, consider a financial product that calculates each user's income every day: if it computes the income of all users and updates their balances in a single transaction, hundreds of millions of updates can take hours, and if a failure in the middle forces a rollback, the time the transaction needs becomes even harder to estimate. Moreover, the users' balance rows stay locked during the update, leaving all users unable to use their balances.

  • What risks do large transactions cause?
    1. They lock a lot of data, causing a great deal of blocking and many lock timeouts
    2. Rolling them back takes a long time
    3. Their long execution time easily causes master-slave delay
  • How can large transactions be avoided?
    1. Avoid processing too much data at once (see the sketch after this list)
    2. Move unnecessary SELECT operations out of transactions
    If you can do these two points, you can basically avoid large transactions.
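A sketch of point 1 for the income-settlement example above; the user_balance and daily_income tables, their columns, and the batch ranges are hypothetical:

  -- Settle one primary-key range per transaction instead of all users at once,
  -- so each batch holds its row locks only briefly and is cheap to roll back.
  START TRANSACTION;
  UPDATE user_balance ub
  JOIN daily_income di
    ON di.user_id = ub.user_id AND di.settle_date = '2021-01-24'
  SET ub.balance = ub.balance + di.income
  WHERE ub.user_id BETWEEN 1 AND 100000;
  COMMIT;

  -- Next batch: WHERE ub.user_id BETWEEN 100001 AND 200000, and so on,
  -- driven by a loop in the application or a stored procedure.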



Statement: This article is reproduced from csdn.net.