Home >Database >Oracle >How to remove duplicate data in oracle

How to remove duplicate data in oracle

coldplay.xixi
coldplay.xixiOriginal
2021-01-04 17:36:2652663browse

Oracle method to remove duplicate data: 1. Find all duplicate rows for the specified column and delete them. The method is count having; 2. Delete all duplicate rows. The code is [delete from nayi224_180824 t where t.rowid in].

How to remove duplicate data in oracle

#The operating environment of this article: Windows7 system, oracle9i version, Dell G3 computer.

Recommended (free): oracle database

oracle removes duplicate data Method:

Create test data

create table nayi224_180824(col_1 varchar2(10), col_2 varchar2(10), col_3 varchar2(10));
insert into nayi224_180824select 1, 2, 3 from dual union allselect 1, 2, 3 from dual union allselect 5, 2, 3 from dual union allselect 10, 20, 30 from dual ;commit;select*from nayi224_180824;
##COL_1COL_2COL_3##11510
2 3
2 3
2 3
20 30
For the specified column, check the result set after deduplication

distinct

select distinct t1.* from nayi224_180824 t1;

##COL_1COL_2COL_310203022The method has great limitations. Because it can only deduplicate all query columns. If I want to deduplicate col_2 and col3, then my result set can only have col_2 and col_3 columns, but not col_1.
select distinct t1.col_2, col_3 from nayi224_180824 t1
##1
3 5
3

COL_2COL_3##23 2030row_number()
select *  from (select t1.*,
               row_number() over(partition by t1.col_2, t1.col_3 order by 1) rn          
               from nayi224_180824 t1) t1 where t1.rn = 1;
But it is also the simplest and easiest way to understand.

COL_1

COL_2COL_3RN123110 20301##It’s a lot more troublesome to write, but it has greater flexibility . For the specified column, find all duplicate rows

count having

select *  from nayi224_180824 t 
where (t.col_2, t.col_3) in (select t1.col_2, t1.col_3                                
from nayi224_180824 t1                               
group by t1.col_2, t1.col_3                              
having count(1) > 1)

COL_1

COL_2COL_3##123 123#The table needs to be checked twice, so the efficiency will be relatively low. Not recommended.
select *  from (select t1.*,               
count(1) over(partition by t1.col_2, t1.col_3) rn          
from nayi224_180824 t1) t1 where t1.rn > 1;
##5 2 3
count over

COL_1

COL_2

COL_3RN##12331233You only need to check the table once, recommended. is a slight modification of the above statement.
##5 2 3 3
Delete all duplicate rows
delete from nayi224_180824 t where t.rowid in (                   
select rid                     
from (select t1.rowid rid,                                   
count(1) over(partition by t1.col_2, t1.col_3) rn                              
from nayi224_180824 t1) t1                    
where t1.rn > 1);
Delete duplicate data and keep one

Analytical function method

delete from nayi224_180824 t where t.rowid in (select rid                     
from (select t1.rowid rid,
    
    row_number() over(partition by t1.col_2, t1.col_3 order by 1) rn                             
    from nayi224_180824 t1) t1                    
    where t1.rn > 1);

Has the consistent high flexibility of analytical functions . You can do whatever you want with the grouping and change the orderby clause to achieve requirements like "retain the maximum id".

group by

delete from nayi224_180824 t where t.rowid not in
       (select max(rowid) from nayi224_180824 t1 group by t1.col_2, t1.col_3);

Sacrifice some flexibility in exchange for higher efficiency.

The above is the detailed content of How to remove duplicate data in oracle. For more information, please follow other related articles on the PHP Chinese website!

Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn