Which MySQL Index Is More Efficient for Range Queries: Leading with High or Low Cardinality Column?

Barbara Streisand
2024-11-29 02:55:14

Index Optimization for Range Queries

A common rule of thumb says that a composite index in MySQL should lead with the higher-cardinality column. When a query combines an equality test on one column with a range test on another, that rule does not hold.

Problem Statement

Consider a table with the following structure:

CREATE TABLE `files` (
  `did` int(10) UNSIGNED NOT NULL DEFAULT '0',
  `filename` VARBINARY(200) NOT NULL,
  `ext` VARBINARY(5) DEFAULT NULL,
  `fsize` DOUBLE DEFAULT NULL,
  `filetime` DATETIME DEFAULT NULL,
  PRIMARY KEY (`did`,`filename`),
  KEY `fe` (`filetime`,`ext`),          -- Option 1
  KEY `ef` (`ext`,`filetime`)           -- Option 2
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Filetimes are distinct, while there are a limited number of ext values (i.e., higher cardinality for filetime, lower cardinality for ext). A query involves both columns with the following conditions:

WHERE ext = '...'
AND filetime BETWEEN ... AND ...

Which index, fe or ef, serves this query better?

Answer

Surprisingly, the index with ext as the first column, despite its lower cardinality, is more efficient for this query.

Explanation

MySQL's optimizer analyzes index alternatives and chooses the one with the lowest cost. Using the optimizer trace, we can observe the reasoning behind this choice.
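The trace can be captured per session. The following is a minimal sketch, assuming MySQL 5.6 or later and the files table defined above; the 'gif' value appears in the discussion, while the date range is an assumed placeholder.

```sql
-- Enable the optimizer trace for the current session only.
SET optimizer_trace = 'enabled=on';

-- Run the statement to be analyzed.
-- The date range below is a placeholder, not taken from the original question.
SELECT COUNT(*)
FROM files
WHERE ext = 'gif'
  AND filetime BETWEEN '2015-01-01' AND '2015-01-31';

-- Read the trace recorded for the statement that just ran.
SELECT TRACE
FROM information_schema.OPTIMIZER_TRACE;

-- Switch tracing off again.
SET optimizer_trace = 'enabled=off';
```

In the resulting JSON, the range analysis section is the place to look: it should list each candidate index along with its estimated rows, its cost, and whether it was chosen.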

For fe (filetime first), the trace for a query on ext = 'gif' shows an estimate of 16684 rows to examine. Because the index is ordered by filetime first, the scan has to walk every entry inside the filetime range and check ext on each one.

For ef (ext first), the optimizer can use both index columns: it seeks directly to the entries for the requested ext and reads only the contiguous run of filetime values inside the range. The trace reports a cost of only 646.61 for this plan. MySQL chooses this index because it can use more key parts and therefore examines fewer entries.
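The two plans can also be compared side by side by forcing each index in turn. This is a sketch under the same assumptions as before (the date range is a placeholder); the estimates it produces depend on the data and the MySQL version, so none are shown here.

```sql
-- Plan when the optimizer is restricted to the filetime-first index.
EXPLAIN
SELECT COUNT(*)
FROM files FORCE INDEX (fe)
WHERE ext = 'gif'
  AND filetime BETWEEN '2015-01-01' AND '2015-01-31';

-- Plan when the optimizer is restricted to the ext-first index.
EXPLAIN
SELECT COUNT(*)
FROM files FORCE INDEX (ef)
WHERE ext = 'gif'
  AND filetime BETWEEN '2015-01-01' AND '2015-01-31';
```

The rows and Extra columns of the two outputs are the ones to compare. Running the same query without any index hint shows which index the optimizer picks on its own.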

Conclusions

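  • Put the columns tested with equality (=) first in the index, regardless of their cardinality.
  • Index key parts are used from left to right, and the lookup stops at the first column tested with a range. Columns after it cannot narrow the scanned range.
  • The cardinality of the individual columns does not determine the best column order in a composite index. What matters is which columns are tested with equality and which with a range.
  • With "Using index condition" (index condition pushdown), InnoDB can still check conditions on index columns beyond those used for the lookup before it reads the full row. This filters entries but does not shrink the range that is scanned.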
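The first two conclusions can be illustrated with a slightly larger case. The query and the index name below are hypothetical and only reuse columns from the files table: two columns are tested with equality and one with a range, so the equality columns go first and the range column goes last.

```sql
-- Hypothetical query: two equality tests and one range test.
-- All literal values are placeholders.
SELECT filename, fsize
FROM files
WHERE did = 42
  AND ext = 'gif'
  AND filetime BETWEEN '2015-01-01' AND '2015-01-31';

-- Equality columns first (their relative order does not matter for this query),
-- then the range column. The index name did_ext_time is made up for this example.
ALTER TABLE files
  ADD INDEX did_ext_time (did, ext, filetime);
```

Whether such an index is worth adding depends on the actual workload; checking the plan with EXPLAIN before and after is the safer route.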
