
A MySQL multi-column index optimization example

零下一度
2017-04-22 15:44:31

As the volume of data collected by our crawlers keeps growing, I have spent the past two days optimizing the database and its queries. One of the tables is defined as follows:

CREATE TABLE `newspaper_article` (
  `id` varchar(50) NOT NULL COMMENT 'ID',
  `title` varchar(190) NOT NULL COMMENT 'title',
  `author` varchar(255) DEFAULT NULL COMMENT 'author',
  `date` date NULL DEFAULT NULL COMMENT 'publication date',
  `content` longtext COMMENT 'body text',
  `status` tinyint(4) DEFAULT '0',
  PRIMARY KEY (`id`),
  KEY `idx_status_date` (`status`,`date`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='article table';

The idx_status_date index was added to meet business requirements, yet the following SQL is still particularly slow:

SELECT id, title, status, date FROM newspaper_article WHERE status > -2 AND date = '2016-01-07';

From what I have observed, no more than about 2,500 rows are added per day. Since the query names a specific date, '2016-01-07', I expected it to scan at most about 2,500 rows, but that is not the case:

[Image: execution plan for the query above]

A total of 185,589 rows were actually scanned, far more than the expected 2,500, and the query took nearly 3 seconds to execute:

[Image: query execution result]
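A plan like the one discussed above can be inspected by prefixing the query with EXPLAIN. The following is only a sketch: it assumes the table name from the CREATE TABLE statement, and the byte counts in the comments assume the column types shown there.

```sql
-- Ask MySQL how it intends to execute the slow query.
EXPLAIN
SELECT id, title, status, date
FROM newspaper_article
WHERE status > -2
  AND date = '2016-01-07';

-- Columns worth reading in the output:
--   key      the index the optimizer chose (expected here: idx_status_date)
--   key_len  how many bytes of that index are used for the lookup;
--            status is a nullable tinyint (1 byte + 1 NULL-flag byte),
--            so a key_len of 2 would mean only the status column is used
--   rows     the optimizer's estimate of how many rows it must examine
```

The `rows` value is an estimate rather than an exact count, so it is best used for comparing plans, not for predicting run time.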

Why is this?

Solution

After changing idx_status_date (status, date) to idx_status (status), I checked the MySQL execution plan again:

[Image: execution plan with the idx_status index]

As you can see, after changing the multi-column index to a single-column index, the number of rows to scan in the execution plan does not change. Since multi-column indexes follow the leftmost-prefix rule, this suggests that the query above uses only the leftmost column, status, of idx_status_date.
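The experiment described above can be sketched as follows. It assumes the table name from the CREATE TABLE statement and should be tried on a test copy of the data, since rebuilding indexes on a large table takes time.

```sql
-- Swap the composite index for a single-column index on status.
ALTER TABLE newspaper_article
  DROP INDEX `idx_status_date`,
  ADD INDEX `idx_status` (`status`);

-- Re-run the plan and compare the rows estimate with the earlier one.
EXPLAIN
SELECT id, title, status, date
FROM newspaper_article
WHERE status > -2
  AND date = '2016-01-07';
```

If the `rows` estimate is the same with both indexes, the date column of the composite index was not helping to narrow the lookup.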

I looked through "High Performance MySQL" and found the following passage, which confirmed my guess:

If a query contains a range condition on some column, none of the columns to the right of it can use the index to optimize the lookup. For example, take the query WHERE last_name = 'Smith' AND first_name LIKE 'J%' AND dob = '1976-12-23'. This query can only use the first two columns of the index, because LIKE is a range condition here (although the server can use the remaining columns for other purposes). If the column in the range condition has a limited number of values, you can replace the range condition with multiple equality conditions.

Therefore, there are two possible solutions here:

  • Replace the range condition on status with multiple equality conditions.

  • Change idx_status_date (status, date) to idx_date_status (date, status), so that the equality column comes first, and create a new idx_status index to cover what the old index did for status.
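Both options can be sketched as below. The status values in the IN list are hypothetical, since the article does not list the statuses that exist; the table name is taken from the CREATE TABLE statement.

```sql
-- Option 1: replace the range on status with an explicit list of values.
-- (-1, 0, 1, 2) is a made-up list; use the statuses your data really has.
SELECT id, title, status, date
FROM newspaper_article
WHERE status IN (-1, 0, 1, 2)
  AND date = '2016-01-07';

-- Option 2: put the equality column (date) first, so the range condition
-- on status falls on the last column of the index.
ALTER TABLE newspaper_article
  DROP INDEX `idx_status_date`,
  ADD INDEX `idx_date_status` (`date`, `status`),
  ADD INDEX `idx_status` (`status`);
```

Option 1 only works while the set of status values stays small and known. Option 2 needs no query changes but adds a second index to maintain on every write.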

Optimized execution plan:

[Image: optimized execution plan]

Actual execution result:

[Image: execution result after optimization]

Summary

When people talk about indexes without specifying a type, they are most likely talking about B-Tree indexes, which use the B-Tree data structure to store data. We use the term "B-Tree" because MySQL also uses this keyword in CREATE TABLE and other statements. The underlying storage engine, however, may use a different storage structure; InnoDB, for example, uses B+Tree. Suppose we have the following table:

CREATE TABLE People (
  last_name  varchar(50)    not null,
  first_name varchar(50)    not null,
  dob        date           not null,
  gender     enum('m', 'f') not null,
  key(last_name, first_name, dob)
);

B-Tree indexes are useful for the following types of queries:

  • Full value matching

    Full value matching means matching all columns in the index. For example, the index in the above table can be used to find a person named Cuba Allen who was born on 1960-01-01.

  • Match the leftmost prefix

    The index in the above table can be used to find all people with the last name of Allen, that is, only the first column of the index is used.

  • Match column prefix

    Only matches the beginning of the value of a column. For example, the index in the above table can be used to find all people whose last names begin with J. Only the first column of the index is used here.

  • Matching range values

    For example, the index in the above table can be used to find people with last names between Allen and Barrymore. Only the first column of the index is used here.

  • Exactly match a certain column and range match another column

    The index in the above table can also be used to find all people whose last name is Allen and whose first name starts with the letter K (such as Kim or Karl). That is, the first column, last_name, is matched exactly, and the second column, first_name, is matched as a range.

  • Query that only accesses the index

    B-Tree indexes can usually support index-only queries, that is, queries that need to access only the index and not the data rows.
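The query types listed above might look like this against the People table. These are illustrative queries only, using the names from the text; whether the optimizer actually picks the index depends on the data and the MySQL version, so it is worth checking each one with EXPLAIN.

```sql
-- Full value match: all three indexed columns are given.
SELECT * FROM People
WHERE last_name = 'Allen' AND first_name = 'Cuba' AND dob = '1960-01-01';

-- Leftmost prefix: only the first column of the index.
SELECT * FROM People WHERE last_name = 'Allen';

-- Column prefix: the beginning of the first column's value.
SELECT * FROM People WHERE last_name LIKE 'J%';

-- Range of values on the first column.
SELECT * FROM People WHERE last_name BETWEEN 'Allen' AND 'Barrymore';

-- Exact match on one column, range match on the next.
SELECT * FROM People
WHERE last_name = 'Allen' AND first_name LIKE 'K%';

-- Index-only query: every selected column is part of the index,
-- so the data rows do not need to be read.
SELECT last_name, first_name, dob FROM People
WHERE last_name = 'Allen';
```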

Some limitations of B-Tree indexes

  • If the search does not start from the leftmost column of the index, the index cannot be used. For example, the index in the above table cannot be used to find people named Bill, nor to find people with a specific birthday, because neither column is the leftmost column. Similarly, it cannot be used to find people whose last names end with a certain letter.

  • Columns in the index cannot be skipped. That is, the index in the above table cannot be used to find people with the last name Smith who were born on a specific date. If first_name is not specified, MySQL can only use the first column of the index.

  • If a query contains a range condition on some column, none of the columns to the right of it can use the index to optimize the lookup. For example, take the query WHERE last_name = 'Smith' AND first_name LIKE 'J%' AND dob = '1976-12-23'. This query can only use the first two columns of the index, because LIKE is a range condition here (although the server can use the remaining columns for other purposes). If the column in the range condition has a limited number of values, you can replace the range condition with multiple equality conditions.
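The limitations above correspond to queries like the following against the People table. This is a sketch for experimentation: the first names in the last query are hypothetical, and newer MySQL versions have optimizer features that can soften some of these rules, so the plans should be confirmed with EXPLAIN.

```sql
-- Not starting from the leftmost column: the index cannot be used
-- to look up matching rows.
SELECT * FROM People WHERE first_name = 'Bill';
SELECT * FROM People WHERE dob = '1976-12-23';
SELECT * FROM People WHERE last_name LIKE '%n';  -- suffix match

-- Skipping first_name: only last_name can be used for the lookup.
SELECT * FROM People
WHERE last_name = 'Smith' AND dob = '1976-12-23';

-- Range condition on first_name: dob, to its right, cannot narrow the lookup.
SELECT * FROM People
WHERE last_name = 'Smith' AND first_name LIKE 'J%' AND dob = '1976-12-23';

-- Replacing the range with a list of equality values lets dob be used too.
-- 'James' and 'John' are made-up examples of the values to enumerate.
SELECT * FROM People
WHERE last_name = 'Smith'
  AND first_name IN ('James', 'John')
  AND dob = '1976-12-23';
```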

