
Understand the windowing function in SQL in one article

WBOY
2022-09-02 16:55:48

This article covers windowing functions (also called analytic functions) in SQL Server. There are two types: aggregate windowing functions and sorting (ranking) windowing functions. Both are introduced below in detail, with example code.


Definition of OVER

OVER defines a window, that is, a set of rows on which a function operates. It does not require a GROUP BY clause to group the data, and it can return both the columns of the base row and aggregate columns in the same row.

OVER syntax

OVER ( [ PARTITION BY column ] [ ORDER BY column ] )

The PARTITION BY clause divides the rows into groups (partitions).

The ORDER BY clause sorts the rows within each partition.

OVER() specifies the set of rows (the window), and the windowing function computes a value from that window for every row of the result set.

Unlike GROUP BY, this does not collapse the rows, so the columns of the base row and the aggregated column can be returned together.
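To make the contrast with GROUP BY concrete, here is a minimal sketch. It is an assumption of this example that Python's built-in sqlite3 module (an in-memory SQLite database with window-function support) stands in for SQL Server so the snippet runs anywhere; the three-row table and the alias GroupTotal are made up for illustration.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for SQL Server; the rows are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (ID INTEGER PRIMARY KEY, GroupName TEXT, Salary INTEGER)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?)",
    [(1, "Dev", 8000), (4, "Dev", 7600), (8, "Finance", 5000)],
)

# GROUP BY collapses each group into a single row.
grouped = conn.execute(
    "SELECT GroupName, SUM(Salary) FROM Employee GROUP BY GroupName ORDER BY GroupName"
).fetchall()

# OVER keeps every base row and attaches the group total to each of them.
windowed = conn.execute(
    "SELECT ID, GroupName, Salary, "
    "       SUM(Salary) OVER (PARTITION BY GroupName) AS GroupTotal "
    "FROM Employee ORDER BY ID"
).fetchall()

print(grouped)   # [('Dev', 15600), ('Finance', 5000)]
print(windowed)  # [(1, 'Dev', 8000, 15600), (4, 'Dev', 7600, 15600), (8, 'Finance', 5000, 5000)]
```

The GROUP BY query returns two rows, while the OVER query returns all three rows with the group total repeated on each.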

Usage of OVER

The OVER window function must be used together with an aggregate function or a sorting (ranking) function. Aggregate functions here means the common ones such as SUM(), MAX(), MIN(), COUNT(), and AVG(). Sorting functions means RANK(), ROW_NUMBER(), DENSE_RANK(), NTILE(), and so on.

Examples of using OVER in aggregate functions

We use the SUM and COUNT functions as examples.

-- Create the test table and test data
-- (departments: 开发部 = Development, 财务部 = Finance, 行政部 = Administration)
CREATE TABLE Employee
(
ID INT  PRIMARY KEY,
Name VARCHAR(20),
GroupName VARCHAR(20),
Salary INT
);
INSERT INTO  Employee
VALUES(1,'小明','开发部',8000),
      (4,'小张','开发部',7600),
      (5,'小白','开发部',7000),
      (8,'小王','财务部',5000),
      (9, null,'财务部',NULL),
      (15,'小刘','财务部',6000),
      (16,'小高','行政部',4500),
      (18,'小王','行政部',4000),
      (23,'小李','行政部',4500),
      (29,'小吴','行政部',4700);

Windowing function after SUM

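SELECT *,
     SUM(Salary) OVER(PARTITION BY GroupName) 每个组的总工资,                 -- total salary of each group
     SUM(Salary) OVER(PARTITION BY GroupName ORDER BY ID) 每个组的累计总工资,  -- running total within each group
     SUM(Salary) OVER(ORDER BY ID) 累计工资,                                  -- running total over all rows
     SUM(Salary) OVER() 总工资                                                -- grand total
FROM Employee;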

Each of the four windowing expressions means something different. They are explained one by one:

SUM(Salary) OVER (PARTITION BY Groupname)

Groups the rows by Groupname, the column after PARTITION BY, and computes the sum of Salary for each group. Every row of a group shows the same group total.

SUM(Salary) OVER (PARTITION BY Groupname ORDER BY ID)

Groups the rows by Groupname, sorts each group by the ID after ORDER BY, and then accumulates Salary within the group. The running total restarts for every group.

SUM(Salary) OVER (ORDER BY ID)

Sorts all rows by ID only and accumulates Salary in that order, giving a running total over the whole table.

SUM(Salary) OVER ()

Sums Salary over all rows. Every row shows the same grand total.
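The four SUM variants can be checked with a small runnable sketch. Assumptions of this example: Python's sqlite3 module stands in for SQL Server, the Name column is left out, the department names are written in English, and the aliases GroupTotal, GroupRunning, Running, and Total are made up for illustration. The IDs and salaries are the ones from the test data above.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for SQL Server. Names are omitted
# and department names are in English for brevity.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (ID INTEGER PRIMARY KEY, GroupName TEXT, Salary INTEGER)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?)",
    [
        (1, "Dev", 8000), (4, "Dev", 7600), (5, "Dev", 7000),
        (8, "Finance", 5000), (9, "Finance", None), (15, "Finance", 6000),
        (16, "Admin", 4500), (18, "Admin", 4000), (23, "Admin", 4500), (29, "Admin", 4700),
    ],
)

rows = conn.execute(
    """
    SELECT ID, GroupName, Salary,
           SUM(Salary) OVER (PARTITION BY GroupName)             AS GroupTotal,
           SUM(Salary) OVER (PARTITION BY GroupName ORDER BY ID) AS GroupRunning,
           SUM(Salary) OVER (ORDER BY ID)                        AS Running,
           SUM(Salary) OVER ()                                   AS Total
    FROM Employee
    ORDER BY ID
    """
).fetchall()

for row in rows:
    print(row)
# First rows printed:
# (1, 'Dev', 8000, 22600, 8000, 8000, 51300)
# (4, 'Dev', 7600, 22600, 15600, 15600, 51300)
# (5, 'Dev', 7000, 22600, 22600, 22600, 51300)
# (8, 'Finance', 5000, 11000, 5000, 27600, 51300)
# (9, 'Finance', None, 11000, 5000, 27600, 51300)
```

Note the row with ID 9: SUM ignores its NULL salary, so both running totals simply carry the previous value forward.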

Windowing function after COUNT

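SELECT *,
       COUNT(*) OVER(PARTITION BY Groupname ) 每个组的个数,               -- number of rows in each group
       COUNT(*) OVER(PARTITION BY Groupname ORDER BY ID) 每个组的累积个数, -- running count within each group
       COUNT(*) OVER(ORDER BY ID) 累积个数 ,                              -- running count over all rows
       COUNT(*) OVER() 总个数                                             -- total number of rows
FROM Employee;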

The four COUNT expressions follow the same pattern as the SUM examples above: the count per group, the running count within each group, the running count over all rows, and the total count. They are therefore not explained again one by one.
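A short sketch of the COUNT variants, under the same assumptions as before (Python's sqlite3 standing in for SQL Server, a reduced four-row table, illustrative aliases). The last column, COUNT(Salary), is not in the original query; it is added here only to show that COUNT(*) counts the row with the NULL salary while COUNT(Salary) does not.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for SQL Server; reduced illustrative data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (ID INTEGER PRIMARY KEY, GroupName TEXT, Salary INTEGER)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?)",
    [(8, "Finance", 5000), (9, "Finance", None), (15, "Finance", 6000), (16, "Admin", 4500)],
)

counts = conn.execute(
    """
    SELECT ID,
           COUNT(*)      OVER (PARTITION BY GroupName)             AS GroupCount,
           COUNT(*)      OVER (PARTITION BY GroupName ORDER BY ID) AS GroupRunningCount,
           COUNT(*)      OVER (ORDER BY ID)                        AS RunningCount,
           COUNT(*)      OVER ()                                   AS TotalCount,
           COUNT(Salary) OVER (PARTITION BY GroupName)             AS NonNullSalaries
    FROM Employee
    ORDER BY ID
    """
).fetchall()

print(counts)
# [(8, 3, 1, 1, 4, 2), (9, 3, 2, 2, 4, 2), (15, 3, 3, 3, 4, 2), (16, 1, 1, 4, 4, 1)]
```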

Examples of using OVER in sorting functions

We demonstrate the four sorting functions one by one

-- First create the test table and test data
-- (classes: 一班 = Class 1, 二班 = Class 2)
WITH t AS
(SELECT 1 StuID,'一班' ClassName,70 Score
UNION ALL
SELECT 2,'一班',85
UNION ALL
SELECT 3,'一班',85
UNION ALL
SELECT 4,'二班',80
UNION ALL
SELECT 5,'二班',74
UNION ALL
SELECT 6,'二班',80
)
SELECT * INTO Scores FROM t;
SELECT * FROM Scores;

ROW_NUMBER()

Definition: ROW_NUMBER() numbers the rows returned by SELECT in the specified order, so each row gets a sequential number. Because rows with equal values still get different numbers, it is not suitable for ranking students' grades. It is generally used for paging queries, such as fetching the first 10 students or students 10 to 100. ROW_NUMBER() must be used together with ORDER BY, otherwise an error is reported.

Sort student scores

SELECT *,
ROW_NUMBER() OVER (PARTITION BY ClassName ORDER BY SCORE DESC) 班内排序,  -- row number within the class
ROW_NUMBER() OVER (ORDER BY SCORE DESC) AS 总排序                         -- overall row number
FROM Scores;

PARTITION BY and ORDER BY play the same roles here as in the aggregate examples above: PARTITION BY groups the rows and ORDER BY sorts them.

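In addition, ROW_NUMBER() can be used to fetch the row at a specified position in the ordering.

-- 总排序 = overall row number
SELECT * FROM (
SELECT *, ROW_NUMBER() OVER (ORDER BY SCORE DESC) AS 总排序
FROM Scores
) t WHERE t.总排序=2;

This returns only the row whose overall row number is 2, that is, the student in second position.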
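The same pattern extends to the paging use case mentioned in the definition. The following is a sketch under stated assumptions: Python's sqlite3 stands in for SQL Server, the class names are in English, `page` and `page_size` are hypothetical parameters, and StuID is added to the ORDER BY as a tie-breaker so that students with equal scores are numbered deterministically.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Scores (StuID INTEGER, ClassName TEXT, Score INTEGER)")
conn.executemany(
    "INSERT INTO Scores VALUES (?, ?, ?)",
    [(1, "Class 1", 70), (2, "Class 1", 85), (3, "Class 1", 85),
     (4, "Class 2", 80), (5, "Class 2", 74), (6, "Class 2", 80)],
)

page_size, page = 2, 2                 # hypothetical paging parameters
first = (page - 1) * page_size + 1     # first row number on the page
last = page * page_size                # last row number on the page

page_rows = conn.execute(
    """
    SELECT StuID, Score, RowNum FROM (
        SELECT StuID, Score,
               ROW_NUMBER() OVER (ORDER BY Score DESC, StuID) AS RowNum
        FROM Scores
    ) AS t
    WHERE RowNum BETWEEN ? AND ?
    ORDER BY RowNum
    """,
    (first, last),
).fetchall()

print(page_rows)  # [(4, 80, 3), (6, 80, 4)]
```

The window function has to be computed in a subquery (or CTE) because its result cannot be referenced in the WHERE clause of the same SELECT.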

RANK()

Definition: RANK(), as the name suggests, is a ranking function that ranks rows by a given field. How does it differ from ROW_NUMBER()? ROW_NUMBER() only numbers the rows: when students have the same grade, it still numbers them consecutively, so their numbers differ. RANK() instead gives rows with the same value the same rank. Look at the example below:

Example

SELECT ROW_NUMBER() OVER (ORDER BY SCORE DESC) AS [RANK],*
FROM Scores;
 
SELECT RANK() OVER (ORDER BY SCORE DESC) AS [RANK],*
FROM Scores;

Result:

The first query shows ROW_NUMBER() and the second shows RANK(). When two students have the same grade, the outputs differ: RANK() gives 1-1-3-3-5-6, while ROW_NUMBER() still gives 1-2-3-4-5-6. This is the difference between RANK() and ROW_NUMBER().

DENSE_RANK() 

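Definition: DENSE_RANK() is also a ranking function. Like RANK(), it ranks rows by a field, so how do the two differ? The difference shows when there are equal grades: DENSE_RANK() produces consecutive ranks, while RANK() skips ranks after a tie. In general, the ranking function commonly used is RANK(). Let's look at an example:

Example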

SELECT 
RANK() OVER (ORDER BY SCORE DESC) AS [RANK],*
FROM Scores;
 
SELECT 
DENSE_RANK() OVER (ORDER BY SCORE DESC) AS [RANK],*
FROM Scores;

The results are as follows:

The first query is RANK(), which gives 1-1-3-3-5-6. The second is DENSE_RANK(), which gives 1-1-2-2-3-4.
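To see the three numbering functions side by side, here is a minimal sketch. Assumptions: Python's sqlite3 stands in for SQL Server, the ClassName column is left out, the aliases RowNum, Rnk, and DenseRnk are illustrative, and StuID is used as a tie-breaker for ROW_NUMBER() only, so that its output is deterministic.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Scores (StuID INTEGER, Score INTEGER)")
conn.executemany(
    "INSERT INTO Scores VALUES (?, ?)",
    [(1, 70), (2, 85), (3, 85), (4, 80), (5, 74), (6, 80)],
)

ranking = conn.execute(
    """
    SELECT StuID, Score,
           ROW_NUMBER() OVER (ORDER BY Score DESC, StuID) AS RowNum,
           RANK()       OVER (ORDER BY Score DESC)        AS Rnk,
           DENSE_RANK() OVER (ORDER BY Score DESC)        AS DenseRnk
    FROM Scores
    ORDER BY Score DESC, StuID
    """
).fetchall()

for row in ranking:
    print(row)
# (2, 85, 1, 1, 1)
# (3, 85, 2, 1, 1)
# (4, 80, 3, 3, 2)
# (6, 80, 4, 3, 2)
# (5, 74, 5, 5, 3)
# (1, 70, 6, 6, 4)
```

Tied scores share a rank under both RANK() and DENSE_RANK(); only RANK() leaves a gap after the tie.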

NTILE()

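Definition: NTILE() distributes the rows of an ordered partition into a specified number of groups. The groups are numbered starting from 1. It works like the "partitioning" we usually talk about: how many partitions there are, and how many rows fall into each.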

-- 分区后排序 = group number after partitioning
SELECT *,NTILE(1) OVER (ORDER BY SCORE DESC) AS 分区后排序 FROM Scores;
SELECT *,NTILE(2) OVER (ORDER BY SCORE DESC) AS 分区后排序 FROM Scores;
SELECT *,NTILE(3) OVER (ORDER BY SCORE DESC) AS 分区后排序 FROM Scores;

In other words, NTILE divides the returned rows as evenly as possible into the number of groups given by its argument.
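When the row count is not divisible by the argument, the groups cannot all be the same size; the earlier groups then receive one extra row each. A small sketch, again assuming Python's sqlite3 as a stand-in for SQL Server, with illustrative aliases Tile3 and Tile4 and StuID as a tie-breaker:

```python
import sqlite3

# Assumption: in-memory SQLite stands in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Scores (StuID INTEGER, Score INTEGER)")
conn.executemany(
    "INSERT INTO Scores VALUES (?, ?)",
    [(1, 70), (2, 85), (3, 85), (4, 80), (5, 74), (6, 80)],
)

tiles = conn.execute(
    """
    SELECT StuID,
           NTILE(3) OVER (ORDER BY Score DESC, StuID) AS Tile3,
           NTILE(4) OVER (ORDER BY Score DESC, StuID) AS Tile4
    FROM Scores
    ORDER BY Score DESC, StuID
    """
).fetchall()

print(tiles)
# [(2, 1, 1), (3, 1, 1), (4, 2, 2), (6, 2, 2), (5, 3, 3), (1, 3, 4)]
```

With six rows, NTILE(3) yields three groups of two, while NTILE(4) yields groups of sizes 2, 2, 1, and 1.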

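Summary

The OVER windowing function comes up often in day-to-day work, especially in data analysis, where data frequently has to be grouped and sorted. This article covered how aggregate functions and sorting functions are combined with OVER. Other functions are also used together with OVER, for example LEAD and LAG, and they are worth mastering as well. (The original article also lists STRING_AGG here; in SQL Server, however, STRING_AGG is an ordinary aggregate that takes WITHIN GROUP rather than OVER, although some other databases do allow it as a window function.)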
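As a brief, hedged illustration of LAG and LEAD with OVER: the sketch below assumes Python's sqlite3 as a stand-in for SQL Server, and the three-row table and the aliases PrevScore and NextScore are made up for this example.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for SQL Server; illustrative data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Scores (StuID INTEGER, Score INTEGER)")
conn.executemany("INSERT INTO Scores VALUES (?, ?)", [(1, 70), (2, 85), (3, 85)])

neighbours = conn.execute(
    """
    SELECT StuID, Score,
           LAG(Score)  OVER (ORDER BY StuID) AS PrevScore,  -- value from the previous row
           LEAD(Score) OVER (ORDER BY StuID) AS NextScore   -- value from the next row
    FROM Scores
    ORDER BY StuID
    """
).fetchall()

print(neighbours)  # [(1, 70, None, 85), (2, 85, 70, 85), (3, 85, 85, None)]
```

The first row has no previous row and the last row has no next row, so LAG and LEAD return NULL (None in Python) there.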


Statement:
This article is reproduced from jb51.net. In case of any infringement, please contact admin@php.cn for removal.