Home >Database >Mysql Tutorial >Which Index Configuration (Range Column First vs. Low Cardinality Column First) Is Optimal for Range Queries on High and Low Cardinality Columns?

Which Index Configuration (Range Column First vs. Low Cardinality Column First) Is Optimal for Range Queries on High and Low Cardinality Columns?

DDD
DDDOriginal
2024-12-16 05:49:10176browse

Which Index Configuration (Range Column First vs. Low Cardinality Column First) Is Optimal for Range Queries on High and Low Cardinality Columns?

Which Index Configuration is Optimal when Working with Range Queries Involving High and Low Cardinality Columns?

In the given scenario, we have a table 'files' with a primary key on 'did' and 'filename', and two additional indexes: 'fe' on 'filetime' and 'ext', and 'ef' on 'ext' and 'filetime'. Our query involves filtering rows based on both 'ext' and 'filetime' using range conditions.

Let's explore which index configuration is more efficient for this query.

Evaluating Index Options

To determine the optimal index, we can analyze the potential index usage and cost estimates using EXPLAIN:

Forcing fe (range column first):

EXPLAIN SELECT COUNT(*), AVG(fsize)
FROM files FORCE INDEX(fe)
WHERE ext = 'gif' AND filetime >= '2015-01-01' AND filetime < '2015-01-01' + INTERVAL 1 MONTH;

Forcing ef (low cardinality column first):

EXPLAIN SELECT COUNT(*), AVG(fsize)
FROM files FORCE INDEX(ef)
WHERE ext = 'gif' AND filetime >= '2015-01-01' AND filetime < '2015-01-01' + INTERVAL 1 MONTH;

Analysis

EXPLAIN suggests that using 'ef' (low cardinality column first) results in a more efficient execution plan compared to 'fe'. This is because 'ef' enables the optimizer to filter rows using both columns of the index, leading to a lower estimated cost.

Optimizer Trace

The Optimizer trace provides additional insights into the index evaluation process:

"potential_range_indices": [
    {
        "index": "fe",
        "usable": true
    },
    {
        "index": "ef",
        "usable": true
    }
],
"analyzing_range_alternatives": {
    "range_scan_alternatives": [
        {
            "index": "fe",
            "ranges": [...],
            "index_only": false,
            "rows": 16684,
            "cost": 20022
        },
        {
            "index": "ef",
            "ranges": [...],
            "index_only": false,
            "rows": 538,
            "cost": 646.61
        }
    ]
},
"attached_conditions_computation": [
    {
        "access_type_changed": {
            "table": "`files`",
            "index": "ef",
            "old_type": "ref",
            "new_type": "range",
            "cause": "uses_more_keyparts"
        }
    }
]

Conclusions

The Optimizer trace confirms that:

  • "fe" (range column first) uses only the first column for filtering.
  • "ef" (low cardinality column first) leverages both columns of the index for filtering.
  • The optimizer prioritizes the use of columns involved in range conditions, regardless of cardinalities, when constructing composite indexes.
  • Cardinality is less relevant for composite indexes in this type of query scenario.

Therefore, considering both EXPLAIN output and the Optimizer trace, the optimal index configuration is ef (ext, filetime) for queries involving both ext and filetime range conditions. By putting the low cardinality column first in the index, we enable the optimizer to use both columns effectively, resulting in a more efficient execution plan.

The above is the detailed content of Which Index Configuration (Range Column First vs. Low Cardinality Column First) Is Optimal for Range Queries on High and Low Cardinality Columns?. For more information, please follow other related articles on the PHP Chinese website!

Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn