Detailed explanation of the usage of OVER (PARTITION BY ..) in Oracle queries

小云云
Original, 2017-12-11 13:23:11

This article explains how to use OVER (PARTITION BY ..) in Oracle queries, with SQL for each case. To make the examples easy to run and test, they all use the emp table that ships with Oracle's sample user Scott.

Note: where a section title includes order by, the functions in that section require an order by clause.

1. rank()/dense_rank() over(partition by ... order by ...)

Suppose the customer needs to query the employees with the highest salary in each department. Anyone with some Oracle experience can write the following SQL statement:

select e.ename, e.job, e.sal, e.deptno 
 from scott.emp e, 
    (select e.deptno, max(e.sal) sal from scott.emp e group by e.deptno) me 
 where e.deptno = me.deptno 
  and e.sal = me.sal;

Even when a requirement is already met, it is a good habit to ask whether there is another way. Here there is: use the rank() over(partition by ...) or dense_rank() over(partition by ...) syntax from the title of this section. The SQL is as follows:

-- The alias is rnk rather than rank so that it does not shadow the function name.
select e.ename, e.job, e.sal, e.deptno 
 from (select e.ename, 
        e.job, 
        e.sal, 
        e.deptno, 
        rank() over(partition by e.deptno order by e.sal desc) rnk 
     from scott.emp e) e 
 where e.rnk = 1;

-- Same query with dense_rank(); for rank 1 the two functions always agree.
select e.ename, e.job, e.sal, e.deptno 
 from (select e.ename, 
        e.job, 
        e.sal, 
        e.deptno, 
        dense_rank() over(partition by e.deptno order by e.sal desc) rnk 
     from scott.emp e) e 
 where e.rnk = 1;

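Why do these return the same result as the first statement? Every employee tied for the top salary in a department gets rank 1, which is exactly the set of rows the join on max(sal) returns. Here is a breakdown of the rank()/dense_rank() over(partition by e.deptno order by e.sal desc) syntax.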

over: Defines the window, that is, the set of rows the function is evaluated over.

partition by e.deptno: Partitions the rows by department number.

order by e.sal desc: Sorts by salary from high to low within each partition (rank() and dense_rank() require order by; leaving it out is an error).

rank()/dense_rank(): Assigns a rank to each row.

The meaning of the entire statement is: within each department, employees are ranked by salary from high to low, and the rank is a number that increases from small to large (the smallest value is always 1).

So what is the difference between rank() and dense_rank()?

rank(): Ranking with gaps. If two rows tie for rank 1, the next row gets rank 3.

dense_rank(): Ranking without gaps. If two rows tie for rank 1, the next row gets rank 2.

Small homework: Query the employees with the lowest salary in each department.
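The difference between the two functions is easiest to see on a tie. The following is a minimal sketch rather than Oracle output: it runs the same window expressions through Python's built-in sqlite3 module (assuming SQLite 3.25 or later, which supports window functions) against a few illustrative rows that are not the real scott.emp contents.

```python
import sqlite3

# Illustrative rows only (name, dept, salary); not the real scott.emp contents.
rows = [("SCOTT", 20, 3000), ("FORD", 20, 3000),
        ("JONES", 20, 2975), ("ADAMS", 20, 1100)]

conn = sqlite3.connect(":memory:")
conn.execute("create table emp (ename text, deptno integer, sal integer)")
conn.executemany("insert into emp values (?, ?, ?)", rows)

ranked = conn.execute("""
    select ename, sal,
           rank()       over (partition by deptno order by sal desc) as rnk,
           dense_rank() over (partition by deptno order by sal desc) as dense_rnk
      from emp
     order by sal desc, ename
""").fetchall()

for row in ranked:
    print(row)
# ('FORD', 3000, 1, 1)
# ('SCOTT', 3000, 1, 1)
# ('JONES', 2975, 3, 2)   <- rank() skips 2, dense_rank() does not
# ('ADAMS', 1100, 4, 3)
```

Filtering on rank 1 gives the same rows with either function; they only diverge below a tie.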

2. min()/max() over(partition by...)

Now that we can get the highest and lowest salary in each department, the customer has a new request: when querying employee information, also calculate the difference between each employee's salary and the department's highest and lowest salary. This is fairly simple. Starting from the group by statement in the first section, modify it as follows:

select e.ename, 
     e.job, 
     e.sal, 
     e.deptno, 
     e.sal - me.min_sal diff_min_sal, 
     me.max_sal - e.sal diff_max_sal 
  from scott.emp e, 
     (select e.deptno, min(e.sal) min_sal, max(e.sal) max_sal 
      from scott.emp e 
      group by e.deptno) me 
  where e.deptno = me.deptno 
  order by e.deptno, e.sal;

The statement above uses min() and max(); the former finds the minimum value and the latter finds the maximum value. What happens if these two functions are used together with over(partition by ...)? Take a look at the following SQL statement:

select e.ename, 
    e.job, 
    e.sal, 
    e.deptno, 
    nvl(e.sal - min(e.sal) over(partition by e.deptno), 0) diff_min_sal, 
    nvl(max(e.sal) over(partition by e.deptno) - e.sal, 0) diff_max_sal 
 from scott.emp e;

Both statements return the same rows (the second has no order by, so the row order may differ). You can see that min() and max() still return the minimum and maximum values, but computed within each partition defined by partition by.

Small homework: If you add order by inside the over(...) clause in this example, what will be the result?
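One way to explore the homework question is to compare the two forms side by side. When an aggregate's over clause contains order by and no explicit window, the default window runs from the start of the partition to the current row, so max() becomes a running maximum. The sketch below shows this with Python's sqlite3 module (assuming SQLite 3.25 or later, which uses the same default window) and a few illustrative rows that are not the real scott.emp contents.

```python
import sqlite3

# Illustrative rows only (name, dept, salary); not the real scott.emp contents.
rows = [("MILLER", 10, 1300), ("CLARK", 10, 2450), ("KING", 10, 5000)]

conn = sqlite3.connect(":memory:")
conn.execute("create table emp (ename text, deptno integer, sal integer)")
conn.executemany("insert into emp values (?, ?, ?)", rows)

result = conn.execute("""
    select ename, sal,
           max(sal) over (partition by deptno)              as dept_max,
           max(sal) over (partition by deptno order by sal) as running_max
      from emp
     order by sal
""").fetchall()

for row in result:
    print(row)
# ('MILLER', 1300, 5000, 1300)
# ('CLARK', 2450, 5000, 2450)
# ('KING', 5000, 5000, 5000)
```

Without order by, every row sees the whole partition; with it, each row sees only the rows up to and including itself in salary order.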

3. lead()/lag() over(partition by ... order by ...)

People love to compare and to save face, and this customer more than most. After comparing salaries with the department's highest and lowest, he still was not satisfied. This time he came up with a rather odd request: calculate the difference between each employee's salary and the salary of the person just above or just below them. This requirement had me stumped; I don't know how to implement it with a group by statement. But now that we have over(partition by ...), everything looks simple. As follows:

select e.ename, 
    e.job, 
    e.sal, 
    e.deptno, 
    lead(e.sal, 1, 0) over(partition by e.deptno order by e.sal) lead_sal, 
    lag(e.sal, 1, 0) over(partition by e.deptno order by e.sal) lag_sal, 
    nvl(lead(e.sal) over(partition by e.deptno order by e.sal) - e.sal, 
      0) diff_lead_sal, 
    nvl(e.sal - lag(e.sal) over(partition by e.deptno order by e.sal), 0) diff_lag_sal 
 from scott.emp e;

After reading the statement above, does the worry seem like a false alarm? Let's explain the two new functions used above.

lead(column name, n, m): Returns the value of the column in the nth row after the current row; if there is no such row, it returns the default value m. If n and m are omitted, it returns the value of the column in the first row after the current row, or null if there is no such row.

lag(column name, n, m): Returns the value of the column in the nth row before the current row; if there is no such row, it returns the default value m. If n and m are omitted, it returns the value of the column in the first row before the current row, or null if there is no such row.
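The default-value behaviour described above can be checked on a tiny partition. This is a minimal sketch using Python's sqlite3 module (assuming SQLite 3.25 or later, where lead() and lag() take the same three arguments) with illustrative rows that are not the real scott.emp contents.

```python
import sqlite3

# Illustrative rows only (name, dept, salary); not the real scott.emp contents.
rows = [("MILLER", 10, 1300), ("CLARK", 10, 2450), ("KING", 10, 5000)]

conn = sqlite3.connect(":memory:")
conn.execute("create table emp (ename text, deptno integer, sal integer)")
conn.executemany("insert into emp values (?, ?, ?)", rows)

result = conn.execute("""
    select ename, sal,
           lead(sal, 1, 0) over (partition by deptno order by sal) as lead_sal,
           lag(sal)        over (partition by deptno order by sal) as lag_sal
      from emp
     order by sal
""").fetchall()

for row in result:
    print(row)
# ('MILLER', 1300, 2450, None)  <- no previous row, lag() without m gives null
# ('CLARK', 2450, 5000, 1300)
# ('KING', 5000, 0, 2450)       <- no next row, lead() falls back to m = 0
```

This is why the article's query wraps the two-argument-free forms in nvl(..., 0) before subtracting: a null neighbour would otherwise make the difference null.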

Below are some other functions commonly used with this syntax (Note: a function written with an order by clause requires it. first_value() and last_value() are accepted without order by, but the row they pick then depends on an unspecified row order):

select e.ename, 
    e.job, 
    e.sal, 
    e.deptno, 
    first_value(e.sal) over(partition by e.deptno) first_sal, 
    last_value(e.sal) over(partition by e.deptno) last_sal, 
    sum(e.sal) over(partition by e.deptno) sum_sal, 
    avg(e.sal) over(partition by e.deptno) avg_sal, 
    count(e.sal) over(partition by e.deptno) count_num, 
    row_number() over(partition by e.deptno order by e.sal) row_num 
 from scott.emp e;
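When order by is added to first_value() and last_value() to make them deterministic, last_value() has a well-known catch: the default window ends at the current row, so it returns the current row's value unless the window is widened explicitly. The following sketch illustrates this with Python's sqlite3 module (assuming SQLite 3.25 or later, which applies the same default window) and illustrative rows that are not the real scott.emp contents.

```python
import sqlite3

# Illustrative rows only (name, dept, salary); not the real scott.emp contents.
rows = [("MILLER", 10, 1300), ("CLARK", 10, 2450), ("KING", 10, 5000)]

conn = sqlite3.connect(":memory:")
conn.execute("create table emp (ename text, deptno integer, sal integer)")
conn.executemany("insert into emp values (?, ?, ?)", rows)

result = conn.execute("""
    select ename, sal,
           first_value(sal) over (partition by deptno order by sal) as first_sal,
           last_value(sal)  over (partition by deptno order by sal) as last_default,
           last_value(sal)  over (partition by deptno order by sal
                                  rows between unbounded preceding
                                           and unbounded following) as last_full
      from emp
     order by sal
""").fetchall()

for row in result:
    print(row)
# ('MILLER', 1300, 1300, 1300, 5000)
# ('CLARK', 2450, 1300, 2450, 5000)
# ('KING', 5000, 1300, 5000, 5000)
```

Only the last_full column, with its explicit window over the whole partition, returns the department's highest salary on every row.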

After reading this article you may come away with the impression that OVER (PARTITION BY ..) is simply better than GROUP BY. That is not the case. The former cannot replace the latter, and it is not as efficient as the latter, but it offers more functionality, so choose between them according to your needs.
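The reason neither can replace the other is that they shape the result differently: group by collapses each group into one row, while over(partition by ...) keeps every row and attaches the aggregate to it. A minimal sketch of the contrast, using Python's sqlite3 module (assuming SQLite 3.25 or later) and illustrative rows that are not the real scott.emp contents:

```python
import sqlite3

# Illustrative rows only (name, dept, salary); not the real scott.emp contents.
rows = [("CLARK", 10, 2450), ("KING", 10, 5000),
        ("SMITH", 20, 800), ("ADAMS", 20, 1100)]

conn = sqlite3.connect(":memory:")
conn.execute("create table emp (ename text, deptno integer, sal integer)")
conn.executemany("insert into emp values (?, ?, ?)", rows)

# group by: one row per department, employee detail is gone.
grouped = conn.execute("""
    select deptno, max(sal) from emp group by deptno order by deptno
""").fetchall()

# over(partition by ...): one row per employee, each carrying the department max.
windowed = conn.execute("""
    select ename, deptno, sal, max(sal) over (partition by deptno) as max_sal
      from emp
     order by deptno, sal
""").fetchall()

print(grouped)        # [(10, 5000), (20, 1100)]
print(len(windowed))  # 4
```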
