Optimizing COUNT(*) on InnoDB tables in MySQL

Author/translator: Ye Jinrong (叶金荣, imysql#imysql.com). Source: http://imysql.com. Reposting is welcome.

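Motivation: COUNT(*) on an InnoDB table is painfully slow, so I looked for a way to make it faster.
Observations: let's start with a few test cases.
I. Tests on the sbtest table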

show create table sbtest\G
*************************** 1. row ***************************
Table: sbtest
Create Table: CREATE TABLE `sbtest` (
`aid` bigint(20) unsigned NOT NULL auto_increment,
`id` int(10) unsigned NOT NULL default '0',
`k` int(10) unsigned NOT NULL default '0',
`c` char(120) NOT NULL default '',
`pad` char(60) NOT NULL default '',
PRIMARY KEY  (`aid`),
KEY `k` (`k`),
KEY `id` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1000001 DEFAULT CHARSET=latin1
show index from sbtest;
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table  | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| sbtest |          0 | PRIMARY  |            1 | aid         | A         |     1000099 |     NULL | NULL   |      | BTREE      |         |
| sbtest |          1 | k        |            1 | k           | A         |          18 |     NULL | NULL   |      | BTREE      |         |
| sbtest |          1 | id       |            1 | id          | A         |     1000099 |     NULL | NULL   |      | BTREE      |         |
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

The table was populated with 1,000,000 rows.
1. Plain COUNT(*)

explain SELECT COUNT(*) FROM sbtest;
+----+-------------+--------+-------+---------------+---------+---------+------+---------+-------------+
| id | select_type | table  | type  | possible_keys | key     | key_len | ref  | rows    | Extra       |
+----+-------------+--------+-------+---------------+---------+---------+------+---------+-------------+
|  1 | SIMPLE      | sbtest | index | NULL          | PRIMARY | 8       | NULL | 1000099 | Using index |
+----+-------------+--------+-------+---------------+---------+---------+------+---------+-------------+
SELECT COUNT(*) FROM sbtest;
+----------+
| COUNT(*) |
+----------+
|  1000000 |
+----------+
1 row in set (1.42 sec)

As shown above, with no WHERE condition the optimizer chooses the primary key for the scan.
2. COUNT(*) with a condition on the primary key column

explain SELECT COUNT(*) FROM sbtest WHERE aid>=0;
+----+-------------+--------+-------+---------------+---------+---------+------+--------+--------------------------+
| id | select_type | table  | type  | possible_keys | key     | key_len | ref  | rows   | Extra                    |
+----+-------------+--------+-------+---------------+---------+---------+------+--------+--------------------------+
|  1 | SIMPLE      | sbtest | range | PRIMARY       | PRIMARY | 8       | NULL | 485600 | Using where; Using index |
+----+-------------+--------+-------+---------------+---------+---------+------+--------+--------------------------+
SELECT COUNT(*) FROM sbtest WHERE aid>=0;
+----------+
| COUNT(*) |
+----------+
|  1000000 |
+----------+
1 row in set (1.39 sec)

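Although the optimizer estimates that it only needs to scan 485,600 records (index entries, really), far fewer than in the previous case, it still ends up scanning the whole table (index). The elapsed time is therefore about the same as in the first case.

3. COUNT(*) with a condition on a secondary index column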

explain SELECT COUNT(*) FROM sbtest WHERE id>=0;
+----+-------------+--------+-------+---------------+------+---------+------+--------+--------------------------+
| id | select_type | table  | type  | possible_keys | key  | key_len | ref  | rows   | Extra                    |
+----+-------------+--------+-------+---------------+------+---------+------+--------+--------------------------+
|  1 | SIMPLE      | sbtest | range | id            | id   | 4       | NULL | 500049 | Using where; Using index |
+----+-------------+--------+-------+---------------+------+---------+------+--------+--------------------------+
SELECT COUNT(*) FROM sbtest WHERE id>=0;
+----------+
| COUNT(*) |
+----------+
|  1000000 |
+----------+
1 row in set (0.43 sec)

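As shown above, this form of the query is much faster.
Some may ask: is it faster simply because the id column is narrower than the aid column? Let's not jump to conclusions; consider the following test.
II. Tests on the sbtest1 table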

show create table sbtest1\G
*************************** 1. row ***************************
Table: sbtest1
Create Table: CREATE TABLE `sbtest1` (
`aid` int(10) unsigned NOT NULL AUTO_INCREMENT,
`id` bigint(20) unsigned NOT NULL DEFAULT '0',
`k` int(10) unsigned NOT NULL DEFAULT '0',
`c` char(120) NOT NULL DEFAULT '',
`pad` char(60) NOT NULL DEFAULT '',
PRIMARY KEY (`aid`),
KEY `k` (`k`),
KEY `id` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1000001 DEFAULT CHARSET=latin1
show index from sbtest1;
+---------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table   | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+---------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| sbtest1 |          0 | PRIMARY  |            1 | aid         | A         |     1000099 |     NULL | NULL   |      | BTREE      |         |
| sbtest1 |          1 | k        |            1 | k           | A         |          18 |     NULL | NULL   |      | BTREE      |         |
| sbtest1 |          1 | id       |            1 | id          | A         |     1000099 |     NULL | NULL   |      | BTREE      |         |
+---------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

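In this table the column widths of aid and id are swapped (aid is now int, id is bigint), and it was likewise populated with 1,000,000 rows.
1. Plain COUNT(*)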

explain SELECT COUNT(*) FROM sbtest1;
+----+-------------+---------+-------+---------------+---------+---------+------+---------+-------------+
| id | select_type | table   | type  | possible_keys | key     | key_len | ref  | rows    | Extra       |
+----+-------------+---------+-------+---------------+---------+---------+------+---------+-------------+
|  1 | SIMPLE      | sbtest1 | index | NULL          | PRIMARY | 4       | NULL | 1000099 | Using index |
+----+-------------+---------+-------+---------------+---------+---------+------+---------+-------------+
SELECT COUNT(*) FROM sbtest1;
+----------+
| COUNT(*) |
+----------+
|  1000000 |
+----------+
1 row in set (1.42 sec)

As shown above, with no WHERE condition the optimizer chooses the primary key for the scan.
2. COUNT(*) with a condition on the primary key column

explain SELECT COUNT(*) FROM sbtest1 WHERE aid>=0;
+----+-------------+---------+-------+---------------+---------+---------+------+--------+--------------------------+
| id | select_type | table   | type  | possible_keys | key     | key_len | ref  | rows   | Extra                    |
+----+-------------+---------+-------+---------------+---------+---------+------+--------+--------------------------+
|  1 | SIMPLE      | sbtest1 | range | PRIMARY       | PRIMARY | 4       | NULL | 316200 | Using where; Using index |
+----+-------------+---------+-------+---------------+---------+---------+------+--------+--------------------------+
1 row in set (0.00 sec)
SELECT COUNT(*) FROM sbtest1 WHERE aid>=0;
+----------+
| COUNT(*) |
+----------+
|  1000000 |
+----------+
1 row in set (1.42 sec)

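Although the optimizer estimates that it only needs to scan 316,200 records (index entries, really), far fewer than in the previous case, it still ends up scanning the whole table (index). The elapsed time is therefore about the same as in the first case.

3. COUNT(*) with a condition on a secondary index column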

explain SELECT COUNT(*) FROM sbtest1 WHERE id>=0;
+----+-------------+---------+-------+---------------+------+---------+------+--------+--------------------------+
| id | select_type | table   | type  | possible_keys | key  | key_len | ref  | rows   | Extra                    |
+----+-------------+---------+-------+---------------+------+---------+------+--------+--------------------------+
|  1 | SIMPLE      | sbtest1 | range | id            | id   | 8       | NULL | 500049 | Using where; Using index |
+----+-------------+---------+-------+---------------+------+---------+------+--------+--------------------------+
1 row in set (0.00 sec)
SELECT COUNT(*) FROM sbtest1 WHERE id>=0;
+----------+
| COUNT(*) |
+----------+
|  1000000 |
+----------+
1 row in set (0.45 sec)

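As shown above, this form of the query is much faster.
All of the tests above were run on MySQL 5.1.24, and mysqld was restarted before each query.
Even after swapping the column widths of aid and id, querying through the secondary index is still much faster than querying through the primary key. So the difference in scan speed comes mainly not from column width but from whether the primary key or a secondary index is used. Why, then, is scanning a secondary index faster than scanning the primary key? To answer that, we need to understand the difference between InnoDB's clustered index and its secondary indexes.
InnoDB's clustered index stores the primary key together with the row data, whereas a secondary index is stored separately, and each of its entries carries a pointer to the primary key. A secondary index entry is therefore much smaller than a full row, so when COUNT(*) has to count every row in the table, scanning a secondary index reads far less data and is clearly faster. The primary key pays off mainly when the index lookup also has to return the row itself, for example: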

SELECT * FROM sbtest WHERE aid = xxx;
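
One rough way to see the size difference between the two kinds of index is to look at the table status. This is only a sketch against the sbtest table from the tests above. For InnoDB tables, Data_length approximates the size of the clustered index (primary key plus row data), while Index_length approximates the combined size of all secondary indexes (here both k and id), so the id index alone is smaller still.

```sql
-- Compare the approximate on-disk sizes (in bytes) of the clustered index
-- and the secondary indexes of sbtest.
--   Data_length  : clustered index (primary key + row data)
--   Index_length : all secondary indexes combined (k and id)
SHOW TABLE STATUS LIKE 'sbtest'\G
```

The values are estimates maintained by InnoDB, so treat them as an order-of-magnitude check rather than an exact measurement.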

If a secondary index is faster than the primary key here, why does the optimizer still prefer the primary key for the scan? Heikki Tuuri's answer was:

in the example table, the secondary index is inserted into in a perfect order! That is
very unusual. Normally the secondary index would be fragmented, causing random disk I/O,
and the scan would be slower than in the primary index.
I am changing this to a feature request: keep 'clustering ratio' statistics on a secondary
index and do the scan there if the order is almost the same as in the primary index. I
doubt this feature will ever be implemented, though.
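
Since the optimizer will not pick the secondary index on its own for an unconditional count, one option is to ask for it with an index hint instead of adding an always-true condition such as id>=0. The sketch below assumes the sbtest table and its id index from the tests above. Whether it actually helps depends on how fragmented the secondary index is, as the quote points out, so check the key column of the EXPLAIN output and time it on your own data.

```sql
-- Request the secondary index `id` for an unconditional row count.
-- FORCE INDEX is a hint; verify in EXPLAIN that `key` shows `id`.
EXPLAIN SELECT COUNT(*) FROM sbtest FORCE INDEX (id);
SELECT COUNT(*) FROM sbtest FORCE INDEX (id);
```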

For details, see the bug report and the article "InnoDB Row Counting using Indexes".
Finally, thanks to Lao Yang (老杨) for the help.
