
Detailed explanation of the difference between distinct and row_number() over() in SQL

Y2J
2017-05-24 13:55:12

This article introduces the difference between distinct and row_number() over() in SQL and shows how each is used. Readers who need it can use it as a reference.

1 Preface

When we write SQL statements to operate on data in a database, we may run into some annoying problems. For example, for records that have the same name in a given field, we only want to display one, but the database may actually contain several records with the same name, so several records come back from the query, which is not what we intended. To avoid this, we need to "deduplicate" the data. So what is deduplication? Simply put, for records with the same content in the same field, only one record is displayed.

So how do we implement deduplication? There are two ways to do it.

The first is to add the distinct keyword when writing the select statement;

The second is to call the row_number() over() function when writing the select statement.

Both methods can deduplicate data, so what are the similarities and differences between them? The rest of this article explains them in detail.

2 distinct

In SQL, the keyword distinct is used to return only unique (distinct) values. The syntax is:


SELECT DISTINCT column_name FROM table_name

Assume there is a table that contains two fields, NAME and AGE. Its contents are as follows:

[Image: contents of the table, with columns NAME and AGE]

Looking at the table above, we find two records with the same NAME and three records with the same AGE. If we run the following SQL statement,


/**
* PPPRDER is the name of the schema, i.e. the table resides in PPPRDER
*/

select name from PPPRDER.table_name

we get the following result:

[Image: query result, column NAME]

Looking at the result, we find that among the four records there are two with the same NAME value: the 2nd and 3rd records are both "gavin". So what if we want records with the same name to be displayed only once? This is where the distinct keyword comes in. Next, run the following SQL statement,


select distinct name from PPPRDER.table_name

we get the following result:

[Image: query result, distinct NAME values]

Looking at the result, our requirement has clearly been met. However, we can't help but wonder what happens if the distinct keyword is applied to two fields at the same time. Let's try it and run the following SQL statement,


select distinct name, age from PPPRDER.table_name

The result is as follows:

[Image: query result, columns NAME and AGE]

Looking at the result, it seems to have had no effect: all of the records are displayed. There are still two records with the same NAME value and three records with the same AGE value; nothing has changed. But in fact this is the expected result, because when distinct is applied to multiple fields, it only deduplicates records in which all of those field values are the same. Our four records clearly do not meet this condition, so distinct treats them as four different records. Words alone prove nothing, so let's add an identical record to the table and verify this. The table after adding the record is as follows:

[Image: contents of the table after adding a duplicate record]

Run the following SQL statement again,


select distinct name, age from PPPRDER.table_name

The result is as follows:

[Image: query result, columns NAME and AGE]

This result confirms the conclusion above.
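The single-field and multi-field behavior described above can be reproduced with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in for the author's database, and the table name `person` and the sample rows are hypothetical, chosen only to match the description (two records named "gavin", three records sharing an age).

```python
import sqlite3

# Hypothetical sample rows; the article's original table is not reproduced here.
ROWS = [("bob", 18), ("gavin", 18), ("gavin", 20), ("lucy", 18)]


def distinct_demo(rows):
    """Return (distinct names, distinct (name, age) pairs) for the given rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE person (name TEXT, age INTEGER)")
    conn.executemany("INSERT INTO person VALUES (?, ?)", rows)
    # distinct on one field: rows with the same name collapse into one
    names = conn.execute(
        "SELECT DISTINCT name FROM person ORDER BY name"
    ).fetchall()
    # distinct on two fields: only rows equal in BOTH fields collapse
    pairs = conn.execute(
        "SELECT DISTINCT name, age FROM person ORDER BY name, age"
    ).fetchall()
    conn.close()
    return names, pairs


names, pairs = distinct_demo(ROWS)
print(names)   # three distinct names
print(pairs)   # all four rows, since no two rows match in both fields

# Add a row identical to an existing one; only that exact duplicate is removed.
names5, pairs5 = distinct_demo(ROWS + [("gavin", 18)])
print(pairs5)  # still four rows
```

With the four original rows the two-field query returns everything, and after the exact duplicate is added it still returns four rows, which matches the conclusion in the text.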

In addition, one point deserves special attention: the keyword distinct only works when it is placed before all of the fields in the select list. It applies to the whole list of selected fields, not just to the field that immediately follows it. Placing it anywhere else, for example between two fields, is not silently ignored; most databases reject such a statement with a syntax error.
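As a quick check of where distinct may appear, the sketch below runs one query with distinct at the front of the field list and one with it between two fields. It uses Python's sqlite3 module and a hypothetical `person` table; other databases word their error messages differently, so only the pass/fail outcome is shown. In SQLite, at least, the misplaced form is rejected as a syntax error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table, used only to give the parser something to check against.
conn.execute("CREATE TABLE person (name TEXT, age INTEGER)")


def try_query(sql):
    """Run sql and return 'ok', or 'error: ...' if SQLite rejects it."""
    try:
        conn.execute(sql).fetchall()
        return "ok"
    except sqlite3.OperationalError as exc:
        return f"error: {exc}"


leading = try_query("SELECT DISTINCT name, age FROM person")
misplaced = try_query("SELECT name, DISTINCT age FROM person")
print(leading)    # distinct before all fields is accepted
print(misplaced)  # distinct between fields is rejected
```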

3 row_number() over()

SQL Server provides a function, row_number(), for numbering the records in a database table. It is used together with an over() clause, whose role is to group and sort the records in the table. The syntax for using the two together is:


ROW_NUMBER() OVER(PARTITION BY COLUMN1 ORDER BY COLUMN2)

This means: group the records in the table by field COLUMN1 and, within each group, sort them by field COLUMN2; the numbering restarts at 1 in each group. Here,

PARTITION BY: indicates grouping
ORDER BY: indicates sorting

Next, let's test with the data in the same table. First, here is the query result without the row_number() over() function:

[Image: query result without row_number() over()]

Then run the following SQL statement,


select PPPRDER.table_name.*, row_number() over(partition by age order by name desc) from PPPRDER.table_name

The result is as follows:

[Image: query result with an added row-number column]

As the result above shows, a column of sequence numbers has been added to the original table. Looking back at the SQL statement we ran, the records were indeed grouped by the value of the AGE field and sorted by the value of the NAME field. The behavior of the function is thus verified.
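The numbering behavior can be sketched as follows, again using Python's sqlite3 module as a stand-in (window functions need SQLite 3.25 or later). The `person` table and its rows are hypothetical sample data, and the outer ORDER BY is added only to make the printed order predictable.

```python
import sqlite3

# Hypothetical sample rows; the article's original table is not reproduced here.
ROWS = [("bob", 18), ("gavin", 18), ("gavin", 20), ("lucy", 18)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, age INTEGER)")
conn.executemany("INSERT INTO person VALUES (?, ?)", ROWS)

# Group by age, sort each group by name descending, number rows from 1 per group.
numbered = conn.execute(
    """
    SELECT name, age,
           ROW_NUMBER() OVER (PARTITION BY age ORDER BY name DESC) AS rn
    FROM person
    ORDER BY age, rn
    """
).fetchall()

for row in numbered:
    print(row)
```

In this sample the three rows with age 18 are numbered 1 to 3 in descending name order, and the single row with age 20 starts again at 1.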

Next, let's look at how to use the row_number() over() function to deduplicate. Looking at the result above, we can see that if we group by NAME, sort by AGE, and then take the first record of each group, we might achieve deduplication. Let's try it and run the following SQL statement,


/*
* rn denotes the column added at the end
* t is an alias for the subquery; SQL Server requires one
*/

select * from
(select PPPRDER.table_name.*, row_number() over(partition by name order by age desc) rn from PPPRDER.table_name) t
where rn = 1

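After running it, the result is as follows:

[Image: query result after keeping only the rows where rn = 1]

Looking at the result above, we find that deduplication has been achieved. Unfortunately, if we look carefully, we notice something unpleasant: when the SQL statement above was executed, the record with NAME "gavin" and AGE "18" was filtered out, yet in real life it is perfectly normal for people with the same name to have different ages.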
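The deduplication query and its side effect can be sketched like this. The sketch assumes Python's sqlite3 module (SQLite 3.25 or later) in place of the author's database, and the `person` table and its rows are hypothetical sample data.

```python
import sqlite3

# Hypothetical sample rows; the article's original table is not reproduced here.
ROWS = [("bob", 18), ("gavin", 18), ("gavin", 20), ("lucy", 18)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, age INTEGER)")
conn.executemany("INSERT INTO person VALUES (?, ?)", ROWS)

# Keep only the first row of each name group (highest age first).
kept = conn.execute(
    """
    SELECT name, age
    FROM (
        SELECT name, age,
               ROW_NUMBER() OVER (PARTITION BY name ORDER BY age DESC) AS rn
        FROM person
    ) AS t
    WHERE rn = 1
    ORDER BY name
    """
).fetchall()

# Rows that were in the table but did not survive the rn = 1 filter.
dropped = sorted(set(ROWS) - set(kept))
print(kept)
print(dropped)
```

The `dropped` list makes the problem visible: a row that is not an exact duplicate of any other row is still removed, simply because it shares a name with a row that sorts ahead of it.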

4 Summary

Having read and worked through the above, we now know that both the keyword distinct and the function row_number() over() can be used to deduplicate data. In practice, however, we need to pay close attention to how each is used and how they differ.

When using the keyword distinct, we need to know that it behaves differently on a single field and on multiple fields. Applied to a single field, it removes all rows whose value in that field is duplicated; applied to multiple fields, it removes only rows in which the values of all of those fields (that is, the fields distinct is applied to) are the same.

When using the function row_number() over(), deduplication is done by first grouping and sorting, then taking the first record of each group (at least in this post). Of course, we could also deduplicate under different conditions here; how exactly to do that is left for readers to work out.
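One possible variation, offered as a sketch rather than as the author's method, is to partition by both NAME and AGE so that only rows identical in both fields fall into the same group. The example assumes Python's sqlite3 module (SQLite 3.25 or later), and the `person` table and its rows, including one exact duplicate, are hypothetical.

```python
import sqlite3

# Hypothetical sample rows, including one exact duplicate of ("gavin", 18).
ROWS = [("bob", 18), ("gavin", 18), ("gavin", 18), ("gavin", 20), ("lucy", 18)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, age INTEGER)")
conn.executemany("INSERT INTO person VALUES (?, ?)", ROWS)

# Each (name, age) combination is its own group, so rn = 1 keeps one row
# per combination and same-name rows with different ages both survive.
kept = conn.execute(
    """
    SELECT name, age
    FROM (
        SELECT name, age,
               ROW_NUMBER() OVER (PARTITION BY name, age ORDER BY age) AS rn
        FROM person
    ) AS t
    WHERE rn = 1
    ORDER BY name, age
    """
).fetchall()

# For comparison: distinct over the same two fields.
via_distinct = conn.execute(
    "SELECT DISTINCT name, age FROM person ORDER BY name, age"
).fetchall()

print(kept)
print(kept == via_distinct)
```

With this partitioning the result matches what distinct on the two fields returns; the row_number() form becomes more useful when other columns need to be carried along or a specific row per group must be chosen.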

Finally, in this post the author has described some thoughts on deduplicating data with the keyword distinct and the function row_number() over(). I hope the above is helpful!
