Duplicate rows are a common problem in databases, and they often need to be removed. Oracle Database provides several ways to deal with duplicate data, and this article introduces a few of them.
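A UNIQUE constraint is the mechanism Oracle uses to ensure that a column, or a combination of columns, stores no duplicate values. The constraint does not delete duplicates that already exist: Oracle refuses to add it while duplicate values are present (error ORA-02299), so the table must be cleaned first with one of the methods described later in this article. Once in place, the constraint prevents new duplicates from being inserted. Note that INSERT IGNORE and REPLACE INTO are MySQL statements and are not available in Oracle. The closest Oracle equivalents are the IGNORE_ROW_ON_DUPKEY_INDEX hint, which makes an INSERT skip rows that would violate the unique index, and the MERGE statement, which can update the existing row instead of inserting a duplicate.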
For example, suppose we have a table named students that stores each student's ID and name. To ensure that student IDs are unique, we can use the following statement:
ALTER TABLE students ADD CONSTRAINT unique_stu_id UNIQUE (stu_id);
In this statement, we add a UNIQUE constraint to the students table to ensure the uniqueness of the data in the stu_id column.
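Because Oracle will not add the constraint while duplicates exist, it can help to list the offending values first. The sketch below does that, and then shows how an insert might skip duplicates once the constraint is in place. It assumes stu_id is numeric, the sample values are made up for illustration, and the IGNORE_ROW_ON_DUPKEY_INDEX hint requires Oracle 11g Release 2 or later.

```sql
-- List stu_id values that occur more than once
-- (these would block the UNIQUE constraint).
SELECT stu_id, COUNT(*) AS cnt
FROM students
GROUP BY stu_id
HAVING COUNT(*) > 1;

-- After the constraint exists: insert a row, silently skipping it
-- if the stu_id is already present.
-- The values 1001 and 'Alice' are illustrative only.
INSERT /*+ IGNORE_ROW_ON_DUPKEY_INDEX(students (stu_id)) */
INTO students (stu_id, name)
VALUES (1001, 'Alice');
```

Without the hint, the same INSERT would fail with ORA-00001 (unique constraint violated) when the stu_id already exists.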
ROWID is a pseudocolumn in Oracle that holds the address of each row and therefore uniquely identifies every row in a table, even when all of the ordinary column values are identical. We can use it to delete duplicate data. The following is an example of using ROWID to delete duplicate data:
DELETE FROM students WHERE ROWID NOT IN (SELECT MAX (ROWID) FROM students GROUP BY stu_id, name);
In this statement, the subquery finds the row with the largest ROWID in each group of rows that share the same stu_id and name. Those rows are retained, and every other row is deleted.
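Before running the DELETE, it may be worth checking how many rows it would remove. A minimal sketch, reusing the same subquery in a SELECT:

```sql
-- Count the rows the DELETE would remove: every row whose ROWID is not
-- the largest one within its (stu_id, name) group.
SELECT COUNT(*) AS rows_to_delete
FROM students
WHERE ROWID NOT IN (
    SELECT MAX(ROWID)
    FROM students
    GROUP BY stu_id, name
);
```

Swapping MAX for MIN keeps the row with the smallest ROWID instead. ROWID order reflects where a row is stored, not necessarily when it was inserted, so neither choice reliably means "oldest" or "newest".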
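Using a temporary table to remove duplicate data is another frequently used method. We first create a working table that holds only the distinct rows, then empty the original table, reinsert the rows from the working table, and finally drop the working table. Although it is often called a temporary table, the table in the example is an ordinary table that exists until it is dropped. This method is simple, but it needs extra storage for a second copy of the data and takes longer on large tables. Note also that TRUNCATE is a DDL statement: it commits implicitly and cannot be rolled back, so if anything fails before the reinsert completes, the data must be recovered from the working table.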
The following is an example of using a temporary table to delete duplicate data:
CREATE TABLE students_new AS SELECT DISTINCT * FROM students;
TRUNCATE TABLE students;
INSERT INTO students SELECT * FROM students_new;
DROP TABLE students_new;
These statements create a table named students_new that holds the distinct rows of the students table, clear the students table, reinsert the rows from students_new into students, and finally drop students_new to complete the deduplication operation.
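If the ability to undo the change matters more than speed, DELETE can stand in for TRUNCATE. This is a variation on the method above rather than the same technique: DELETE is ordinary DML, so the emptied table and the reinserted rows stay uncommitted until we decide. A sketch under that assumption:

```sql
-- CREATE TABLE is DDL and commits implicitly, so the statements
-- that follow it run in a fresh transaction.
CREATE TABLE students_new AS SELECT DISTINCT * FROM students;

DELETE FROM students;                             -- DML: can be rolled back
INSERT INTO students SELECT * FROM students_new;  -- put the distinct rows back

-- Inspect the result before making it permanent.
SELECT COUNT(*) AS remaining_rows FROM students;

COMMIT;                    -- or ROLLBACK to restore the original rows
DROP TABLE students_new;   -- drop only after the result has been checked
```

DELETE generates undo and redo for every row, so on a large table it is noticeably slower than TRUNCATE. That is the price of being able to roll back.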
A CTE (common table expression), introduced by the WITH keyword, defines a named subquery inside a SQL statement. Using a CTE, we can deduplicate data in a single SQL statement. Unlike SQL Server, Oracle does not allow a DELETE to target the CTE directly, so the CTE is placed inside a subquery that returns the ROWIDs of the rows to delete. The following is an example of using a CTE to delete duplicate data:
DELETE FROM students WHERE ROWID IN ( WITH cte AS ( SELECT ROWID AS rid, ROW_NUMBER() OVER (PARTITION BY stu_id, name ORDER BY ROWID) AS rn FROM students ) SELECT rid FROM cte WHERE rn > 1 );
In this statement, we use the WITH keyword to define a named subquery called cte, in which the ROW_NUMBER function numbers the rows within each group that shares the same stu_id and name. The first row in each group gets the number 1 and is kept. Rows numbered greater than 1 are the duplicates, and the outer DELETE removes them by ROWID to complete the deduplication operation.
The methods above can effectively handle duplicate data in an Oracle database. Which one to choose depends on the situation. For example, to quickly delete a small amount of duplicate data, we can use the second method (ROWID). If the amount of data is large, we can use the third method (rebuilding the table from its distinct rows) or the fourth method (ROW_NUMBER in a CTE). In every case, we should back up the data and test the statement before deleting anything, to avoid data loss and operational errors.
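The backup-and-test advice can be sketched as follows for the DELETE-based methods. The table name students_backup is hypothetical, and the middle step is a placeholder for whichever DELETE statement is chosen.

```sql
-- Keep a copy of the data before deleting anything.
-- students_backup is a hypothetical name.
CREATE TABLE students_backup AS SELECT * FROM students;

-- ... run one of the DELETE-based methods here ...

-- Confirm that no (stu_id, name) combination occurs more than once.
SELECT stu_id, name, COUNT(*) AS cnt
FROM students
GROUP BY stu_id, name
HAVING COUNT(*) > 1;

-- If the query returns no rows, make the change permanent;
-- otherwise undo the DELETE.
COMMIT;
-- ROLLBACK;
```

A copy made with CREATE TABLE ... AS SELECT holds the data only. Indexes and most constraints are not copied, so it serves as a safety net for the rows, not as a full replacement for the table.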