What is the performance principle of MySQL COUNT(*)

王林
2023-05-27 10:49:37

1. Which is faster: COUNT(1), COUNT(*), or COUNT(field)?

How each form executes:

  • COUNT(*): MySQL optimizes count(*) specially. InnoDB traverses an index (the smallest one available, as discussed in section 3) without extracting any column values, and the server layer accumulates the count row by row.

  • COUNT(1): The InnoDB engine traverses the table but does not extract any values. The server layer places the number "1" in each returned row and accumulates the count row by row.

  • COUNT(field): If the field is defined as NOT NULL, the InnoDB engine reads the field from each record row by row; the server layer determines that it cannot be NULL and accumulates the count row by row. If the field definition allows NULL, the InnoDB engine reads the field row by row, and the server layer must then fetch the value and check it. Only non-NULL values are counted.
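
The three forms only return different results when a column allows NULL. A minimal sketch, assuming a hypothetical scratch table named count_demo that is not part of the article's experiment:

```sql
-- Hypothetical scratch table (an assumption for illustration only).
CREATE TABLE count_demo (
  id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  nickname VARCHAR(20) NULL
) ENGINE=InnoDB;

-- Three rows, one of which has a NULL nickname.
INSERT INTO count_demo (nickname) VALUES ('a'), (NULL), ('c');

-- COUNT(*), COUNT(1) and COUNT(id) count every row: 3.
-- COUNT(nickname) skips the NULL value: 2.
SELECT COUNT(*), COUNT(1), COUNT(id), COUNT(nickname) FROM count_demo;

DROP TABLE count_demo;
```

The NULL check on nickname is the extra per-row work that COUNT(field) may carry compared with COUNT(*) and COUNT(1).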

Experimental Analysis

The environment used for testing in this article:

[root@zhyno1 ~]# cat /etc/system-release
CentOS Linux release 7.9.2009 (Core)

[root@zhyno1 ~]# uname -a
Linux zhyno1 3.10.0-1160.62.1.el7.x86_64 #1 SMP Tue Apr 5 16:57:59 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

The test database (InnoDB storage engine, all other parameters at their defaults):

(Mon Jul 25 09:41:39 2022)[root@GreatSQL][(none)]>select version();
+-----------+
| version() |
+-----------+
| 8.0.25-16 |
+-----------+
1 row in set (0.00 sec)

Experiment start:

# First, create a test table

CREATE TABLE test_count (
  `id` int(10) NOT NULL AUTO_INCREMENT PRIMARY KEY,
  `name` varchar(20) NOT NULL,
  `salary` int(1) NOT NULL,
  KEY `idx_salary` (`salary`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

# Insert 10 million rows
DELIMITER //
CREATE PROCEDURE insert_1000w()
BEGIN
    DECLARE i INT;
    SET i=1;
    WHILE i<=10000000 DO
        INSERT INTO test_count(name,salary) VALUES('KAiTO',1);
        SET i=i+1;
    END WHILE;
END//
DELIMITER ;
# Run the stored procedure
call insert_1000w();

Next, let’s experiment separately:

COUNT(1) took 4.19 seconds

(Sat Jul 23 22:56:04 2022)[root@GreatSQL][test]>select count(1) from test_count;
+----------+
| count(1) |
+----------+
| 10000000 |
+----------+
1 row in set (4.19 sec)

COUNT(*) took 4.16 seconds

(Sat Jul 23 22:57:41 2022)[root@GreatSQL][test]>select count(*) from test_count;
+----------+
| count(*) |
+----------+
| 10000000 |
+----------+
1 row in set (4.16 sec)

COUNT(field), here COUNT(id), took 4.23 seconds

(Sat Jul 23 22:58:56 2022)[root@GreatSQL][test]>select count(id) from test_count;
+-----------+
| count(id) |
+-----------+
|  10000000 |
+-----------+
1 row in set (4.23 sec)

Next, we look at the execution plan of each query.

COUNT(*)

(Sat Jul 23 22:59:16 2022)[root@GreatSQL][test]>explain select count(*) from test_count;
+----+-------------+------------+------------+-------+---------------+------------+---------+------+---------+----------+-------------+
| id | select_type | table      | partitions | type  | possible_keys | key        | key_len | ref  | rows    | filtered | Extra       |
+----+-------------+------------+------------+-------+---------------+------------+---------+------+---------+----------+-------------+
|  1 | SIMPLE      | test_count | NULL       | index | NULL          | idx_salary | 4       | NULL | 9980612 |   100.00 | Using index |
+----+-------------+------------+------------+-------+---------------+------------+---------+------+---------+----------+-------------+
1 row in set, 1 warning (0.01 sec)

(Sat Jul 23 22:59:48 2022)[root@GreatSQL][test]>show warnings;
+-------+------+-----------------------------------------------------------------------+
| Level | Code | Message                                                               |
+-------+------+-----------------------------------------------------------------------+
| Note  | 1003 | /* select#1 */ select count(0) AS `count(*)` from `test`.`test_count` |
+-------+------+-----------------------------------------------------------------------+
1 row in set (0.00 sec)

COUNT(1)

(Sat Jul 23 23:12:45 2022)[root@GreatSQL][test]>explain select count(1) from test_count;
+----+-------------+------------+------------+-------+---------------+------------+---------+------+---------+----------+-------------+
| id | select_type | table      | partitions | type  | possible_keys | key        | key_len | ref  | rows    | filtered | Extra       |
+----+-------------+------------+------------+-------+---------------+------------+---------+------+---------+----------+-------------+
|  1 | SIMPLE      | test_count | NULL       | index | NULL          | idx_salary | 4       | NULL | 9980612 |   100.00 | Using index |
+----+-------------+------------+------------+-------+---------------+------------+---------+------+---------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

(Sat Jul 23 23:13:02 2022)[root@GreatSQL][test]>show warnings;
+-------+------+-----------------------------------------------------------------------+
| Level | Code | Message                                                               |
+-------+------+-----------------------------------------------------------------------+
| Note  | 1003 | /* select#1 */ select count(1) AS `count(1)` from `test`.`test_count` |
+-------+------+-----------------------------------------------------------------------+
1 row in set (0.00 sec)

COUNT(field)

(Sat Jul 23 23:13:14 2022)[root@GreatSQL][test]>explain select count(id) from test_count;
+----+-------------+------------+------------+-------+---------------+------------+---------+------+---------+----------+-------------+
| id | select_type | table      | partitions | type  | possible_keys | key        | key_len | ref  | rows    | filtered | Extra       |
+----+-------------+------------+------------+-------+---------------+------------+---------+------+---------+----------+-------------+
|  1 | SIMPLE      | test_count | NULL       | index | NULL          | idx_salary | 4       | NULL | 9980612 |   100.00 | Using index |
+----+-------------+------------+------------+-------+---------------+------------+---------+------+---------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

(Sat Jul 23 23:13:29 2022)[root@GreatSQL][test]>show warnings;
+-------+------+-----------------------------------------------------------------------------------------------+
| Level | Code | Message                                                                                       |
+-------+------+-----------------------------------------------------------------------------------------------+
| Note  | 1003 | /* select#1 */ select count(`test`.`test_count`.`id`) AS `count(id)` from `test`.`test_count` |
+-------+------+-----------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

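Note that the plan changes when COUNT is given a non-primary-key column together with a filter. For COUNT(name) with a condition on id, MySQL uses a range scan on the PRIMARY key, and Extra shows Using where instead of Using index: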

(Tue Jul 26 14:01:57 2022)[root@GreatSQL][test]>explain select count(name) from test_count where id <100 ;
+----+-------------+------------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
| id | select_type | table      | partitions | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra       |
+----+-------------+------------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | test_count | NULL       | range | PRIMARY       | PRIMARY | 4       | NULL |   99 |   100.00 | Using where |
+----+-------------+------------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
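
EXPLAIN reports only the estimated plan. As a possible follow-up, assuming MySQL 8.0.18 or later (where EXPLAIN ANALYZE is available), the statements can be executed and instrumented so that measured times and row counts appear next to the estimates. This sketch reuses the article's test_count table; no output is shown because it varies by machine and version.

```sql
-- Runs each query and reports actual time and rows per plan step.
EXPLAIN ANALYZE SELECT COUNT(*) FROM test_count;
EXPLAIN ANALYZE SELECT COUNT(name) FROM test_count WHERE id < 100;
```

Because EXPLAIN ANALYZE really executes the statement, the first query takes about as long as the plain COUNT(*) did.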

Experimental results

  • 1. In this experiment, COUNT(*) and COUNT(1) were the fastest, followed by COUNT(id), although the gaps are small (4.16 s, 4.19 s, and 4.23 s).

  • 2. The MySQL query optimizer rewrote count(*) into count(0) and selected the idx_salary index.

  • 3. count(1) and count(id) also selected the idx_salary index.

Experimental conclusion

In terms of speed:

COUNT(*) = COUNT(1) > COUNT(id)

MySQL's official documentation also says:

InnoDB handles SELECT COUNT(*) and SELECT COUNT(1) operations in the same way. There is no performance difference


In other words, MySQL optimizes COUNT(1) and COUNT(*) in exactly the same way, and there is no performance difference between them.

Even so, COUNT(*) is the recommended form, because it is the standard syntax for counting rows defined by SQL92.

2. COUNT(*) and TABLE_ROWS

For InnoDB tables, the space each table occupies and the number of rows it holds can be looked up in MySQL's information_schema database. That database contains a TABLES table whose main columns are:

  • TABLE_SCHEMA: Database name

  • TABLE_NAME: Table name

  • ENGINE: Storage engine used

  • TABLE_ROWS: Number of records

  • DATA_LENGTH: Data size

  • INDEX_LENGTH: Index size

TABLE_ROWS shows how many rows the table currently has, and querying it is very fast. Can TABLE_ROWS replace count(*)?

We use TABLE_ROWS to query the number of records in the table:

(Sat Jul 23 23:15:14 2022)[root@GreatSQL][test]>SELECT TABLE_ROWS
    -> FROM INFORMATION_SCHEMA.TABLES
    -> WHERE TABLE_NAME = 'test_count';
+------------+
| TABLE_ROWS |
+------------+
|    9980612 |
+------------+
1 row in set (0.03 sec)

The result (9980612) does not match the 10000000 rows actually in the table, because under the InnoDB engine TABLE_ROWS is only an approximate estimate. It therefore cannot replace count(*).
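
One way to confirm that TABLE_ROWS is a statistic rather than a live count is to refresh the statistics and query it again. A sketch, assuming the table lives in the test schema (as the console prompt in this article suggests) and MySQL 8.0, where information_schema statistics are cached and governed by the information_schema_stats_expiry variable:

```sql
-- Ask information_schema for current statistics instead of cached ones
-- (the default expiry is 86400 seconds).
SET SESSION information_schema_stats_expiry = 0;

-- Recompute InnoDB's sampled statistics for the table.
ANALYZE TABLE test_count;

-- Filter on the schema too, so a same-named table elsewhere is not matched.
SELECT TABLE_ROWS
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA = 'test'
  AND TABLE_NAME = 'test_count';

-- The exact figure still requires a scan.
SELECT COUNT(*) FROM test_count;
```

Even after ANALYZE TABLE the value is derived from sampled index pages, so it can still differ from the COUNT(*) result.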

3. How is COUNT(*) executed?

The first thing to make clear is that MySQL has several storage engines, and count(*) is implemented differently in each of them. This article mainly covers how it is executed on the InnoDB engine.

In the InnoDB storage engine, count(*) first reads the table's data into the in-memory buffer and then scans the whole table to obtain the number of rows. Put simply, it is a full scan solved with a single loop: read a row, decide whether that row should be included in the count, and repeat, counting row by row.

In the MyISAM engine, the total number of rows of a table is stored on disk, so count(*) returns this number directly, which is very efficient.
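
As a rough check of this behavior, assuming a hypothetical MyISAM scratch table named count_myisam (not from the article) and a server where MyISAM is enabled, EXPLAIN should indicate that no table access is needed:

```sql
-- Hypothetical scratch table (an assumption for illustration only).
CREATE TABLE count_myisam (
  id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(20) NOT NULL
) ENGINE=MyISAM;

INSERT INTO count_myisam (name) VALUES ('a'), ('b'), ('c');

-- For engines that keep an exact row count, the Extra column is
-- expected to read "Select tables optimized away".
EXPLAIN SELECT COUNT(*) FROM count_myisam;

DROP TABLE count_myisam;
```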

InnoDB does not store the row count the way MyISAM does because of multi-version concurrency control (MVCC): even for several queries running at the same moment, the number of rows an InnoDB table should return is not fixed, since each transaction may see a different set of rows. In exchange, InnoDB outperforms MyISAM in transaction support, concurrency, and data safety.
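
The effect of MVCC on the row count can be observed with two connections. A sketch, assuming the default REPEATABLE READ isolation level, autocommit enabled in session B, and the article's test_count table:

```sql
-- Session A
START TRANSACTION;
SELECT COUNT(*) FROM test_count;   -- first read establishes A's snapshot

-- Session B (a separate connection, autocommit on)
INSERT INTO test_count (name, salary) VALUES ('KAiTO', 1);

-- Session A again, still inside the same transaction
SELECT COUNT(*) FROM test_count;   -- same value as A's first read
COMMIT;

-- Session A, after its transaction has ended
SELECT COUNT(*) FROM test_count;   -- now includes the row inserted by B
```

At the moment B's insert commits, the two sessions legitimately disagree about the row count, which is why a single stored number cannot serve every transaction.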

Despite this, InnoDB has optimized the count(*) operation. InnoDB tables are index-organized: the leaf nodes of the primary key index tree hold the full row data, while the leaf nodes of an ordinary (secondary) index tree hold only primary key values. An ordinary index tree is therefore much smaller than the primary key index tree. For count(*), traversing any index tree gives logically the same result, so the MySQL optimizer picks the smallest tree to traverse. This matches the execution plans in section 1, where idx_salary was chosen.
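
One possible way to inspect this choice on the article's test_count table is to compare the optimizer's plan with a plan forced onto the primary key, and to look at the size statistics. This is only a sketch; it assumes the table is in the test schema, and the size figures are estimates in bytes.

```sql
-- Plan chosen by the optimizer (idx_salary in the article's output).
EXPLAIN SELECT COUNT(*) FROM test_count;

-- Force the clustered (primary key) index for comparison.
EXPLAIN SELECT COUNT(*) FROM test_count FORCE INDEX (PRIMARY);

-- For InnoDB, DATA_LENGTH approximates the clustered index size and
-- INDEX_LENGTH the total size of the secondary indexes.
SELECT DATA_LENGTH, INDEX_LENGTH
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA = 'test'
  AND TABLE_NAME = 'test_count';
```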

Note that this article discusses count(*) without filter conditions. Once a WHERE condition is added, a MyISAM table can no longer return the result that quickly.

4. Summary

  • 1. In terms of speed: COUNT(*) = COUNT(1) > COUNT(id).

  • 2. The COUNT function is mainly used to count table rows. Its main forms are COUNT(*), COUNT(field), and COUNT(1).

  • 3. Because COUNT(*) is the standard row-counting syntax defined by SQL92, MySQL has optimized it heavily. MyISAM records the table's total row count directly and returns it for COUNT(*) queries, while InnoDB chooses the smallest index when scanning the table to reduce cost. These optimizations apply only to queries without WHERE or GROUP BY conditions.

  • 4. In InnoDB, COUNT(*) and COUNT(1) are implemented the same way and are equally efficient, whereas COUNT(field) has to check whether the field is NULL, so it is less efficient.

  • 5. Because COUNT(*) is the standard SQL92 syntax for counting rows and is highly efficient, it is the recommended way to query the number of rows in a table.

  • 6. As the earlier COUNT(name) example suggests, build well-performing indexes according to business needs when creating tables, and avoid unnecessary ones.

Statement:
This article is reproduced from yisu.com. If there is any infringement, please contact admin@php.cn to have it removed.