Home  >  Article  >  Database  >  MySQL architecture

MySQL architecture

巴扎黑
巴扎黑Original
2017-06-23 14:50:581043browse

Review the first chapter of MySQL Architecture and History of "High PerformanceMySQL"

1.1 MySQL Logical Architecture

Reference

Figure 1-1: MySQL server logical architecture diagram

The top-level services are not unique to MySQL. Most of them are network-based. Client/server tools or services have a similar architecture. Such as connection processing, authorization authentication, security, etc.

The second layer architecture is the more interesting part of MySQL. Most of MySQL's core service functions are in this layer, including query parsing, analysis, optimization, caching, and all built-in functions (such as date, time, math, and encryption functions). All cross-storage engine functions are in this layer. Layer implementation: stored procedures, triggers, views, etc.

The third layer contains the storage engine. The storage engine is responsible for the storage and retrieval of data in MySQL. Like various file systems under GNU/Linux, each storage engine has its advantages and disadvantages. The server communicates with the storage engine through APIs. These interfaces shield the differences between different storage engines, making these differences transparent to the upper-layer query process. The storage engine API contains dozens of low-level functions for performing operations such as "starting a transaction" or "extracting a row of records based on the primary key." However, the storage engine will not parse SQL, and different storage engines will not communicate with each other, but will simply respond to the request of the upper server.

1.2 Concurrency control

1.2.1 Read-write lock

These two types of locks are usually called shared locks (shared lock) and exclusive locks (exclusive lock) , also called read lock (read lock) and write lock (write lock). Read locks are shared, or non-blocking. Multiple clients can read the same resource at the same time without interfering with each other. Write locks are exclusive, which means that a write lock blocks other write locks and read locks.

1.2.2 Lock granularity

The two most important lock strategies: table locks and row-level locks

Table lock(table lock)

Table Locks are the most basic locking strategy in MySQL and are the least expensive strategy. It will lock the entire table. Before a user can perform write operations (insert, delete, update, etc.) on the table, he needs to obtain a write lock, which will block all read and write operations on the table by other users. Only when there is no write lock, other reading users can obtain the read lock, and read locks do not block each other.

In certain scenarios, table locks may also have good performance. For example, READ LOCAL table locks support certain types of concurrent write operations. In addition, write locks also have a higher priority than read locks, so a write lock request may be inserted in front of the read lock queue (a write lock can be inserted in front of a read lock in the lock queue, whereas a read lock cannot be inserted) to the front of the write lock).

Row-level lock (row lock)

Row-level lock can support concurrent processing to the greatest extent (it also brings the greatest lock overhead). As we all know, row-level locking is implemented in InnoDB and XtraDB, as well as some other storage engines. Row-level locks are only implemented at the storage engine layer, but not at the MySQL server layer. The server layer has no knowledge of the lock implementation in the storage engine.

1.3 Transactions

Transactions support the ACID principle.

Atomicity

A transaction must be viewed as an indivisible minimum unit of work.

Consistency (consistency)

The database always transitions from one consistent state to another consistent state.

Isolation

Generally speaking, modifications made by one transaction are not visible to other transactions before they are finally committed.

Durability

Once a transaction is committed, the modifications made will be permanently saved in the database.

1.3.1 Isolation Level

The following is a brief introduction to the four isolation levels.

READ UNCOMMITTED (uncommitted read)

At the READ UNCOMMITTED level, modifications in a transaction are visible to other transactions even if they are not committed. Transactions can read uncommitted data, which is also called dirty read. This level can cause a lot of problems. In terms of performance, READ UNCOMMITTED is not much better than other levels, but it lacks many of the benefits of other levels. Unless there are really necessary reasons, it is rarely used in practical applications. .

READ COMMITTED

The default isolation level of most database systems is READ COMMITTED (but not MySQL). Any modifications made by a transaction from the beginning until it is committed are not visible to other transactions. This level is sometimes called nonrepeatable read, because executing the same query twice may result in different results.

REPEATABLE READ (repeatable read)

REPEATABLE READ solves the problem of dirty reading. This level ensures that the results of reading the same record multiple times in the same transaction are consistent. However, in theory, the repeatable read isolation level still cannot solve another phantom read (Phantom Read) problem. The so-called phantom read means that when a transaction reads records in a certain range, another transaction inserts a new record in the range. When the previous transaction reads the records in the range again, it will Produce Phantom Row. InnoDB and XtraDB storage engines solve the problem of phantom reads through multiversion concurrency control (MVCC).

Repeatable read is MySQL’s default transaction isolation level.

SERIALIZABLE (Serializable)

SERIALIZABLE is the highest isolation level. It avoids the phantom read problem mentioned earlier by forcing transactions to be executed serially. Simply put, SERIALIZABLE will lock every row of data retrieved, so it may cause a lot of timeouts and lock contention problems. This isolation level is rarely used in actual applications. This level should only be considered when it is very necessary to ensure data consistency and no concurrency is acceptable.

1.3.2 Deadlock

Deadlock refers to two or more transactions occupying each other on the same resource and requesting to lock the resources occupied by each other, resulting in The phenomenon of vicious circle. Deadlocks can occur when multiple transactions try to lock resources in different orders. Deadlock can also occur when multiple transactions lock the same resource at the same time.

In order to solve this problem, the database system implements various deadlock detection and deadlock timeout mechanisms. More complex systems, such as the InnoDB storage engine, are more capable of detecting deadlock circular dependencies and returning an error immediately. This solution is very effective, otherwise deadlock will lead to very slow queries. Another solution is to give up the lock request when the query time reaches the lock wait timeout setting. This method is generally not good. InnoDB's current method of dealing with deadlocks is to roll back the transaction holding the least row-level exclusive lock (this is a relatively simple deadlock rollback algorithm).

The behavior and order of locks are related to the storage engine. Executing statements in the same order, some storage engines will produce deadlocks, while others will not. Deadlocks occur for twofold reasons: some are due to real data conflicts, which are often difficult to avoid, but some are entirely due to the way the storage engine is implemented.

1.3.3 Transaction log

Using the transaction log, the storage engine only needs to modify its memory copy when modifying the table data, and then record the modification to the transaction log that is persisted on the hard disk. instead of persisting the modified data itself to disk each time. The transaction log is appended. After the transaction log is persisted, the modified data in the memory can be slowly flushed back to the disk in the background. At present, most storage engines are implemented in this way, which we usually call write-ahead logging (Write-Ahead Logging). Modifying data requires writing to the disk twice.

If the data modification has been recorded in the transaction log and persisted, but the data itself has not been written back to the disk, and the system crashes, the storage engine can automatically restore this modified data when restarting. The specific recovery method depends on the storage engine.

1.3.4 Transactions in MySQL

1.4 Multi-version concurrency control

The implementation of MVCC is achieved by saving a snapshot of the data at a certain point in time. In other words, no matter how long it takes to execute, the data seen by each transaction is consistent. Depending on the time when the transaction starts, the data seen by each transaction on the same table at the same time may be different. Below we illustrate how MVCC works through a simplified version of InnoDB's behavior.

InnoDB’s MVCC is implemented by saving two hidden columns behind each row of records. Of these two columns, one holds the creation time of the row, and the other holds the expiration time (or deletion time) of the row. Of course, what is stored is not the actual time value, but the system version number. Every time a new transaction is started, the system version number is automatically incremented. The system version number at the start of the transaction will be used as the version number of the transaction, which is used to compare with the version number of each row of records queried. Let's take a look at how MVCC operates specifically under the REPEATABLE READ isolation level.

SELECT

InnoDB will check each row of records based on the following two conditions:

a. InnoDB only looks for data rows whose version is earlier than the current transaction version (that is, the system version number of the row is less than or equal to the system version number of the transaction). This ensures that the rows read by the transaction either already exist before the transaction starts. Either inserted or modified by the transaction itself.

b. The deleted version of the row is either undefined or greater than the current transaction version number. This ensures that the rows read by the transaction were not deleted before the transaction started.

Only records that meet the above two conditions can be returned as query results.

INSERT

InnoDB saves the current system version number as the row version number for each newly inserted row.

DELETE

​ InnoDB saves the current system version number as the row deletion identifier for each deleted row.

UPDATE

​ InnoDB inserts a new row of records, saves the current system version number as the row version number, and saves the current system version number to the original row as the row deletion identifier.

Save these two additional system version numbers so that most read operations can be done without locking. This design makes the data reading operation very simple, the performance is very good, and it also ensures that only rows that meet the standards are read. The disadvantages are that each row of records requires additional storage space, more row checking, and some additional maintenance work.

MVCC only works under two isolation levels: REPEATABLE READ and READ COMMITTED. The other two isolation levels are incompatible with MVCC Note 4 because READ UNCOMMITTED always reads the latest data row, not the data row that conforms to the current transaction version. SERIALIZABLE will lock all rows read.

1.5 MySQL storage engine

In the file system, MySQL saves each database (also called schema) as a subdirectory under the data directory. When creating a table, MySQL will create a .frm file with the same name as the table in the database subdirectory to save the table definition. For example, when you create a table named MyTable, MySQL will save the definition of the table in the MyTable.frm file. Because MySQL uses file system directories and files to store database and table definitions, case sensitivity is closely related to the specific platform. In Windows, case is insensitive; in Unix-like cases it is case sensitive. Different storage engines save data and indexes in different ways, but the definition of tables is handled uniformly in the MySQL service layer.

You can use the SHOW TABLE STATUS command (in versions after MySQL 5.0, you can also query the corresponding table in INFORMATION SCHEMA) to display table-related information. For example, for the user table in the mysql database:

mysql> SHOW TABLE STATUS LIKE 'user' \G

Name: user

Engine: MyISAM

Row_format: Dynamic

Rows: 6

Avg_row_length: 59

Data length: 356

Max data length: 4294967295

Index length : 2048

Data_free: 0

Auto_increment: NULL

Create_time: 2002-01-24 18:07:17

Update_time : 2002 -01- 24 21: 56 : 29

Check_time: NULL

Collation: ut f8_bin

Checksum: NULL

Create_options:

Comment: Users and global privileges

1 row in set (o.oo sec)

The results show that this is a MyISAM table. There is a lot of other information as well as statistics in the output. Below is a brief introduction to the meaning of each line.

Name

##Table name.

Engine

The storage engine type of the table. In older versions, the column's name was Type, not Engine.

Row- format

The format of the row. For MyISAM tables, the optional values ​​are Dynamic, Fixed, or Comp ressed. Dynamic's row length is variable and generally contains variable-length fields, such as VARCHAR or BLOB. Fixed row length is fixed and only contains fixed-length columns, such as CHAR and INTEGER. Compressed rows only exist in compressed tables.

Rows

The number of rows in the table. For MyISAM and some other storage engines, the value is precise, but for InnoDB it is an estimate.

Avg_ row_length

The average number of bytes contained in each line.

Data_length

The size of the table data in bytes.

Max- data_length

The maximum capacity of table data. This value is related to the storage engine.

Index_length

The size of the index in bytes.

Data_free

For MyISAM tables, indicates the space that has been allocated but is not currently used. This part of the space includes previously deleted rows and the space that can be used by INSERT later.

Auto_increment

The value of the next AUTO INCREMENT.

Create_time

The creation time of the table.

Update_time

The last modification time of table data.

Check_ time

The time when the table was last checked using the CKECK TABLE command or the myisamchk tool.

Collation

The default character set and character column collation of the table.

Checksum

If enabled, the real-time checksum of the entire table is saved.

Create_options

Other options specified when creating the table.

Comment

This column contains some other additional information. For MyISAM tables, the comments that were included when the table was created are saved. For InnoDB tables, the remaining space information of the InnoDB table space is saved. If it is a view, this column contains the text word "VIEW".

1.6 MySQL Timeline

1.7 MySQL Development Model

Reference: "High Performance MySQL"

The above is the detailed content of MySQL architecture. For more information, please follow other related articles on the PHP Chinese website!

Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn