search
HomeDatabaseMysql Tutorialmysql数据库分组(GROUP BY)查询实例_MySQL

bitsCN.com

1.使用松散(Loose)索引扫描实现 GROUP BY

何谓松散索引扫描实现 GROUP BY 呢?实际上就是当 MySQL 完全利用索引扫描来实现 GROUP BY 的时候,并不需要扫描所有满足条件的索引键即可完成操作得出结果。

下面我们通过一个示例来描述松散索引扫描实现 GROUP BY,在示例之前我们需要首先调整一下 group_message 表的索引,将 gmt_create 字段添加到 group_id 和 user_id 字段的索引中:

 代码如下 复制代码
1 sky@localhost : example 08:49:45> create index idx_gid_uid_gc
2
3 -> on group_message(group_id,user_id,gmt_create);
4
5 Query OK, rows affected (0.03 sec)
6
7 Records: 96 Duplicates: 0 Warnings: 0
8
9 sky@localhost : example 09:07:30> drop index idx_group_message_gid_uid
10
11 -> on group_message;
12
13 Query OK, 96 rows affected (0.02 sec)
14
15 Records: 96 Duplicates: 0 Warnings: 0

然后再看如下 Query 的执行计划:

 代码如下 复制代码

1 sky@localhost : example 09:26:15> EXPLAIN
2
3 -> SELECT user_id,max(gmt_create)
4
5 -> FROM group_message
6
7 -> WHERE group_id 8
9 -> GROUP BY group_id,user_idG
10
11 *************************** 1. row ***************************
12
13 id: 1
14
15 select_type: SIMPLE
16
17 table: group_message
18
19 type: range
20
21 possible_keys: idx_gid_uid_gc
22
23 key: idx_gid_uid_gc
24
25 key_len: 8
26
27 ref: NULL
28
29 rows: 4
30
31 Extra: Using where; Using index for group-by
32
33 1 row in set (0.00 sec)

我们看到在执行计划的 Extra 信息中有信息显示“Using index for group-by”,实际上这就是告诉我们,MySQL Query Optimizer 通过使用松散索引扫描来实现了我们所需要的 GROUP BY 操作。

下面这张图片描绘了扫描过程的大概实现:

要利用到松散索引扫描实现 GROUP BY,需要至少满足以下几个条件:

◆GROUP BY 条件字段必须在同一个索引中最前面的连续位置;

◆在使用GROUP BY 的同时,只能使用 MAX 和 MIN 这两个聚合函数;

◆如果引用到了该索引中 GROUP BY 条件之外的字段条件的时候,必须以常量形式存在;

为什么松散索引扫描的效率会很高?

因为在没有WHERE子句,也就是必须经过全索引扫描的时候, 松散索引扫描需要读取的键值数量与分组的组数量一样多,也就是说比实际存在的键值数目要少很多。而在WHERE子句包含范围判断式或者等值表达式的时候, 松散索引扫描查找满足范围条件的每个组的第1个关键字,并且再次读取尽可能最少数量的关键字。

2.使用紧凑(Tight)索引扫描实现 GROUP BY

紧凑索引扫描实现 GROUP BY 和松散索引扫描的区别主要在于他需要在扫描索引的时候,读取所有满足条件的索引键,然后再根据读取恶的数据来完成 GROUP BY 操作得到相应结果。

 代码如下 复制代码

1  sky@localhost : example 08:55:14> EXPLAIN
2
3 -> SELECT max(gmt_create)
4
5 -> FROM group_message
6
7 -> WHERE group_id = 2
8
9 -> GROUP BY user_idG
10
11 *************************** 1. row ***************************
12
13 id: 1
14
15 select_type: SIMPLE
16
17 table: group_message
18
19 type: ref
20
21 possible_keys: idx_group_message_gid_uid,idx_gid_uid_gc
22
23 key: idx_gid_uid_gc
24
25 key_len: 4
26
27 ref: const
28
29 rows: 4
30
31 Extra: Using where; Using index
32
33 1 row in set (0.01 sec)


这时候的执行计划的 Extra 信息中已经没有“Using index for group-by”了,但并不是说 MySQL 的 GROUP BY 操作并不是通过索引完成的,只不过是需要访问 WHERE 条件所限定的所有索引键信息之后才能得出结果。这就是通过紧凑索引扫描来实现 GROUP BY 的执行计划输出信息。

在 MySQL 中,MySQL Query Optimizer 首先会选择尝试通过松散索引扫描来实现 GROUP BY 操作,当发现某些情况无法满足松散索引扫描实现 GROUP BY 的要求之后,才会尝试通过紧凑索引扫描来实现。

当 GROUP BY 条件字段并不连续或者不是索引前缀部分的时候,MySQL Query Optimizer 无法使用松散索引扫描,设置无法直接通过索引完成 GROUP BY 操作,因为缺失的索引键信息无法得到。但是,如果 Query 语句中存在一个常量值来引用缺失的索引键,则可以使用紧凑索引扫描完成 GROUP BY 操作,因为常量填充了搜索关键字中的“差距”,可以形成完整的索引前缀。这些索引前缀可以用于索引查找。而如果需要排序GROUP BY结果,并且能够形成索引前缀的搜索关键字,MySQL还可以避免额外的排序操作,因为使用有顺序的索引的前缀进行搜索已经按顺序检索到了所有关键字。

3.使用临时表实现 GROUP BY

MySQL 在进行 GROUP BY 操作的时候要想利用所有,必须满足 GROUP BY 的字段必须同时存放于同一个索引中,且该索引是一个有序索引(如 Hash 索引就不能满足要求)。而且,并不只是如此,是否能够利用索引来实现 GROUP BY 还与使用的聚合函数也有关系。

前面两种 GROUP BY 的实现方式都是在有可以利用的索引的时候使用的,当 MySQL Query Optimizer 无法找到合适的索引可以利用的时候,就不得不先读取需要的数据,然后通过临时表来完成 GROUP BY 操作。

 代码如下 复制代码

1 sky@localhost : example 09:02:40> EXPLAIN
2
3 -> SELECT max(gmt_create)
4
5 -> FROM group_message
6
7 -> WHERE group_id > 1 and group_id 8
9 -> GROUP BY user_idG
10
11 *************************** 1. row ***************************
12 www.111cn.net
13 id: 1
14
15 select_type: SIMPLE
16
17 table: group_message
18
19 type: range
20
21 possible_keys: idx_group_message_gid_uid,idx_gid_uid_gc
22
23 key: idx_gid_uid_gc
24
25 key_len: 4
26
27 ref: NULL
28
29 rows: 32
30
31 Extra: Using where; Using index; Using temporary; Using filesort

这次的执行计划非常明显的告诉我们 MySQL 通过索引找到了我们需要的数据,然后创建了临时表,又进行了排序操作,才得到我们需要的 GROUP BY 结果。整个执行过程大概如下图所展示:

当 MySQL Query Optimizer 发现仅仅通过索引扫描并不能直接得到 GROUP BY 的结果之后,他就不得不选择通过使用临时表然后再排序的方式来实现 GROUP BY了。

在这样示例中即是这样的情况。 group_id 并不是一个常量条件,而是一个范围,而且 GROUP BY 字段为 user_id。所以 MySQL 无法根据索引的顺序来帮助 GROUP BY 的实现,只能先通过索引范围扫描得到需要的数据,然后将数据存入临时表,然后再进行排序和分组操作来完成 GROUP BY。

讲了这么多其实最简单的就是

(查询dedecms(织梦)程序的栏目标题表,以栏目id分组)

 代码如下 复制代码

SELECT *
FROM `dede_archives`
GROUP BY `typeid`
LIMIT 0 , 30

这样即可了

一些相关group by 实例

 代码如下 复制代码


--按某一字段分组取最大(小)值所在行的数据

/*
数据如下:
name val memo
a 2 a2(a的第二个值)
a 1 a1--a的第一个值
a 3 a3:a的第三个值
b 1 b1--b的第一个值
b 3 b3:b的第三个值
b 2 b2b2b2b2
b 4 b4b4
b 5 b5b5b5b5b5
*/

--创建表并插入数据:

create table tb(name varchar(10),val int,memo varchar(20))
insert into tb values('a', 2, 'a2(a的第二个值)')
insert into tb values('a', 1, 'a1--a的第一个值')
insert into tb values('a', 3, 'a3:a的第三个值')
insert into tb values('b', 1, 'b1--b的第一个值')
insert into tb values('b', 3, 'b3:b的第三个值')
insert into tb values('b', 2, 'b2b2b2b2')
insert into tb values('b', 4, 'b4b4')
insert into tb values('b', 5, 'b5b5b5b5b5')
go

--一、按name分组取val最大的值所在行的数据。

--方法1:select a.* from tb a where val = (select max(val) from tb where name = a.name) order by a.name
--方法2:
select a.* from tb a where not exists(select 1 from tb where name = a.name and val > a.val)
--方法3:
select a.* from tb a,(select name,max(val) val from tb group by name) b where a.name = b.name and a.val = b.val order by a.name
--方法4:
select a.* from tb a inner join (select name , max(val) val from tb group by name) b on a.name = b.name and a.val = b.val order by a.name
--方法5
select a.* from tb a where 1 > (select count(*) from tb where name = a.name and val > a.val ) order by a.name
/*
name val memo
---------- ----------- --------------------
a 3 a3:a的第三个值
b 5 b5b5b5b5b5

*/

本人推荐使用1,3,4,结果显示1,3,4效率相同,2,5效率差些,不过我3,4效率相同毫无疑问,1就不一样了,想不搞了。

--二、按name分组取val最小的值所在行的数据。

--方法1:select a.* from tb a where val = (select min(val) from tb where name = a.name) order by a.name
--方法2:
select a.* from tb a where not exists(select 1 from tb where name = a.name and val --方法3:
select a.* from tb a,(select name,min(val) val from tb group by name) b where a.name = b.name and a.val = b.val order by a.name
--方法4:
select a.* from tb a inner join (select name , min(val) val from tb group by name) b on a.name = b.name and a.val = b.val order by a.name
--方法5
select a.* from tb a where 1 > (select count(*) from tb where name = a.name and val /*
name val memo
---------- ----------- --------------------
a 1 a1--a的第一个值
b 1 b1--b的第一个值

*/

--三、按name分组取第一次出现的行所在的数据。

select a.* from tb a where val = (select top 1 val from tb where name = a.name) order by a.name
/*
name val memo
---------- ----------- --------------------
a 2 a2(a的第二个值)
b 1 b1--b的第一个值
*/

--四、按name分组随机取一条数据。

select a.* from tb a where val = (select top 1 val from tb where name = a.name order by newid()) order by a.name/*
name val memo
---------- ----------- --------------------
a 1 a1--a的第一个值
b 5 b5b5b5b5b5

*/

--五、按name分组取最小的两个(N个)val

select a.* from tb a where 2 > (select count(*) from tb where name = a.name and val select a.* from tb a where exists (select count(*) from tb where name = a.name and val /*
name val memo
---------- ----------- --------------------
a 1 a1--a的第一个值
a 2 a2(a的第二个值)
b 1 b1--b的第一个值
b 2 b2b2b2b2

*/

--六、按name分组取最大的两个(N个)val

select a.* from tb a where 2 > (select count(*) from tb where name = a.name and val > a.val ) order by a.name,a.val
select a.* from tb a where val in (select top 2 val from tb where name=a.name order by val desc) order by a.name,a.val
select a.* from tb a where exists (select count(*) from tb where name = a.name and val > a.val having Count(*) /*
name val memo
---------- ----------- --------------------
a 2 a2(a的第二个值)
a 3 a3:a的第三个值
b 4 b4b4
b 5 b5b5b5b5b5
*/


--七,假如整行数据有重复,所有的列都相同(例如下表中的第5,6两行数据完全相同)。
按name分组取最大的两个(N个)val

/*
数据如下:
name val memo
a 2 a2(a的第二个值)
a 1 a1--a的第一个值
a 1 a1--a的第一个值
a 3 a3:a的第三个值
a 3 a3:a的第三个值
b 1 b1--b的第一个值
b 3 b3:b的第三个值
b 2 b2b2b2b2
b 4 b4b4
b 5 b5b5b5b5b5

 你可能感兴趣的文章
  • mysql一对多关联查询的时候筛选条件
bitsCN.com
Statement
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn
Explain the InnoDB Buffer Pool and its importance for performance.Explain the InnoDB Buffer Pool and its importance for performance.Apr 19, 2025 am 12:24 AM

InnoDBBufferPool reduces disk I/O by caching data and indexing pages, improving database performance. Its working principle includes: 1. Data reading: Read data from BufferPool; 2. Data writing: After modifying the data, write to BufferPool and refresh it to disk regularly; 3. Cache management: Use the LRU algorithm to manage cache pages; 4. Reading mechanism: Load adjacent data pages in advance. By sizing the BufferPool and using multiple instances, database performance can be optimized.

MySQL vs. Other Programming Languages: A ComparisonMySQL vs. Other Programming Languages: A ComparisonApr 19, 2025 am 12:22 AM

Compared with other programming languages, MySQL is mainly used to store and manage data, while other languages ​​such as Python, Java, and C are used for logical processing and application development. MySQL is known for its high performance, scalability and cross-platform support, suitable for data management needs, while other languages ​​have advantages in their respective fields such as data analytics, enterprise applications, and system programming.

Learning MySQL: A Step-by-Step Guide for New UsersLearning MySQL: A Step-by-Step Guide for New UsersApr 19, 2025 am 12:19 AM

MySQL is worth learning because it is a powerful open source database management system suitable for data storage, management and analysis. 1) MySQL is a relational database that uses SQL to operate data and is suitable for structured data management. 2) The SQL language is the key to interacting with MySQL and supports CRUD operations. 3) The working principle of MySQL includes client/server architecture, storage engine and query optimizer. 4) Basic usage includes creating databases and tables, and advanced usage involves joining tables using JOIN. 5) Common errors include syntax errors and permission issues, and debugging skills include checking syntax and using EXPLAIN commands. 6) Performance optimization involves the use of indexes, optimization of SQL statements and regular maintenance of databases.

MySQL: Essential Skills for Beginners to MasterMySQL: Essential Skills for Beginners to MasterApr 18, 2025 am 12:24 AM

MySQL is suitable for beginners to learn database skills. 1. Install MySQL server and client tools. 2. Understand basic SQL queries, such as SELECT. 3. Master data operations: create tables, insert, update, and delete data. 4. Learn advanced skills: subquery and window functions. 5. Debugging and optimization: Check syntax, use indexes, avoid SELECT*, and use LIMIT.

MySQL: Structured Data and Relational DatabasesMySQL: Structured Data and Relational DatabasesApr 18, 2025 am 12:22 AM

MySQL efficiently manages structured data through table structure and SQL query, and implements inter-table relationships through foreign keys. 1. Define the data format and type when creating a table. 2. Use foreign keys to establish relationships between tables. 3. Improve performance through indexing and query optimization. 4. Regularly backup and monitor databases to ensure data security and performance optimization.

MySQL: Key Features and Capabilities ExplainedMySQL: Key Features and Capabilities ExplainedApr 18, 2025 am 12:17 AM

MySQL is an open source relational database management system that is widely used in Web development. Its key features include: 1. Supports multiple storage engines, such as InnoDB and MyISAM, suitable for different scenarios; 2. Provides master-slave replication functions to facilitate load balancing and data backup; 3. Improve query efficiency through query optimization and index use.

The Purpose of SQL: Interacting with MySQL DatabasesThe Purpose of SQL: Interacting with MySQL DatabasesApr 18, 2025 am 12:12 AM

SQL is used to interact with MySQL database to realize data addition, deletion, modification, inspection and database design. 1) SQL performs data operations through SELECT, INSERT, UPDATE, DELETE statements; 2) Use CREATE, ALTER, DROP statements for database design and management; 3) Complex queries and data analysis are implemented through SQL to improve business decision-making efficiency.

MySQL for Beginners: Getting Started with Database ManagementMySQL for Beginners: Getting Started with Database ManagementApr 18, 2025 am 12:10 AM

The basic operations of MySQL include creating databases, tables, and using SQL to perform CRUD operations on data. 1. Create a database: CREATEDATABASEmy_first_db; 2. Create a table: CREATETABLEbooks(idINTAUTO_INCREMENTPRIMARYKEY, titleVARCHAR(100)NOTNULL, authorVARCHAR(100)NOTNULL, published_yearINT); 3. Insert data: INSERTINTObooks(title, author, published_year)VA

See all articles

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

Video Face Swap

Video Face Swap

Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Tools

DVWA

DVWA

Damn Vulnerable Web App (DVWA) is a PHP/MySQL web application that is very vulnerable. Its main goals are to be an aid for security professionals to test their skills and tools in a legal environment, to help web developers better understand the process of securing web applications, and to help teachers/students teach/learn in a classroom environment Web application security. The goal of DVWA is to practice some of the most common web vulnerabilities through a simple and straightforward interface, with varying degrees of difficulty. Please note that this software

VSCode Windows 64-bit Download

VSCode Windows 64-bit Download

A free and powerful IDE editor launched by Microsoft

SublimeText3 Mac version

SublimeText3 Mac version

God-level code editing software (SublimeText3)

SAP NetWeaver Server Adapter for Eclipse

SAP NetWeaver Server Adapter for Eclipse

Integrate Eclipse with SAP NetWeaver application server.

Dreamweaver Mac version

Dreamweaver Mac version

Visual web development tools