Home  >  Article  >  Database  >  Oracle删除重复记录的几种方式

Oracle删除重复记录的几种方式

WBOY
WBOYOriginal
2016-06-07 17:29:31920browse

如果把一个文件多次导入数据库,可能会引入重复记录,那么有哪些方法可以删除重复记录呢?

Oracle删除重复记录的几种方式

如果把一个文件多次导入数据库,可能会引入重复记录,那么有哪些方法可以删除重复记录呢?

 REATE TABLE tbl_test(
  SER_NO NUMBER,
  FST_NM VARCHAR2(30),
  DEPTID NUMBER,
  CMNT  VARCHAR2(30));
 
 INSERT INTO tbl_test VALUES(1, 'aaaaa', 2004, 'xxx');
 INSERT INTO tbl_test VALUES(2, 'bbbbb', 2005, 'yyy');
 INSERT INTO tbl_test VALUES(1, 'aaaaa', 2004, 'xxx');
 INSERT INTO tbl_test VALUES(1, 'aaaaa', 2004, 'xxx');
 INSERT INTO tbl_test VALUES(3, 'ccccc', 2005, 'zzz');
 INSERT INTO tbl_test VALUES(2, 'bbbbb', 2005, 'yyy');
 
1.Using MIN(rowid) 最常用的方法,但是数据量大的话执行会很长时间
 
 DELETE FROM tbl_test
    WHERE ROWID NOT IN (SELECT  MIN (ROWID)
          FROM tbl_test
        GROUP BY ser_no, fst_nm, deptid, cmnt);
       
2.Using MIN(rowid) & Join 跟第一条差不多

 DELETE FROM tbl_test t
    WHERE t.ROWID NOT IN (SELECT MIN (b.ROWID)
          FROM tbl_test b
        WHERE b.ser_no = t.ser_no
          AND b.fst_nm = t.fst_nm
          AND b.deptid = t.deptid
          AND b.cmnt  = t.cmnt);
         
3.Using Subquery

 DELETE FROM tbl_test
 WHERE ser_no IN (SELECT ser_no FROM tbl_test GROUP BY ser_no, fst_nm, deptid, cmnt HAVING COUNT (*) > 1)
 AND fst_nm IN (SELECT fst_nm FROM tbl_test GROUP BY ser_no, fst_nm, deptid, cmnt HAVING COUNT (*) > 1)
 AND deptid IN (SELECT deptid FROM tbl_test GROUP BY ser_no, fst_nm, deptid, cmnt HAVING COUNT (*) > 1)
 AND cmnt  IN (SELECT cmnt  FROM tbl_test GROUP BY ser_no, fst_nm, deptid, cmnt HAVING COUNT (*) > 1)
 AND ROWID NOT IN (SELECT  MIN (ROWID)
 FROM tbl_test
    GROUP BY ser_no, fst_nm, deptid, cmnt
 HAVING COUNT (*) > 1)         
       
4. Using Nested Subqueries     

 DELETE FROM tbl_test a WHERE (a.ser_no, a.fst_nm, a.deptid, a.cmnt) IN (SELECT b.ser_no, b.fst_nm, b.deptid, b.cmnt
 FROM tbl_test b WHERE a.ser_no = b.ser_no AND a.fst_nm = b.fst_nm AND a.deptid = b.deptid AND a.cmnt  = b.cmnt AND
 a.ROWID  > b.ROWID);
               
5. Using Analytic Fucntions: 对于大表这是最有效的方法

 DELETE FROM tbl_test WHERE ROWID IN (SELECT rid FROM (SELECT ROWID rid,
 ROW_NUMBER () OVER (PARTITION BY ser_no, fst_nm, deptid, cmnt ORDER BY ROWID) rn FROM tbl_test)WHERE rn 1);             
     
6. CREATE-DROP-RENAME 对资源使用比较合理,,特别对于大表。但是如果需要回滚则会产生大量undo日志信息。
 
 CREATE  TABLE tbl_test1 NOLOGGING AS SELECT tbl_test .*
 FROM tbl_test tbl_test WHERE ROWID IN (SELECT rid
 FROM (SELECT ROWID rid, ROW_NUMBER() OVER (PARTITION BY ser_no, fst_nm, deptid, cmnt ORDER BY ROWID) rn
 FROM tbl_test) WHERE rn=1);
     
 DROP TABLE tbl_test; --drop the original table with lots of duplicate 
 
 RENAME tbl_test1 TO tbl_test; -- your original table without duplicates.

linux

Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn