Preserving a Single Copy: Deleting Duplicate PostgreSQL Rows
Duplicate rows are a common data-quality problem, and PostgreSQL offers several ways to handle them. This article covers one specific case: deleting duplicate rows while keeping exactly one copy from each set of duplicates.
Query Solution
A single DELETE statement with a subquery is enough:
DELETE FROM foo
WHERE id NOT IN (
    SELECT min(id)  -- or max(id) to keep the highest id instead
    FROM foo
    GROUP BY hash
);
Explanation
The subquery returns the minimum (or maximum) id for each group of rows that share the same hash value. The hash column stands for whatever column, or set of columns, defines a duplicate. The outer DELETE then removes every row whose id is not in that list, so exactly one row per group survives. This relies on id being unique and non-null.
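To make the effect concrete, here is a minimal sketch on a toy table. Only the names foo, id, and hash come from the query above; the column types and the sample values are assumptions chosen for illustration.

```sql
-- Hypothetical sample table; column types are assumed for illustration.
CREATE TEMP TABLE foo (
    id   integer PRIMARY KEY,
    hash text NOT NULL
);

INSERT INTO foo (id, hash) VALUES
    (1, 'a'),
    (2, 'a'),
    (3, 'b'),
    (4, 'a'),
    (5, 'b'),
    (6, 'c');

-- Keep the lowest id per hash value; delete every other row.
DELETE FROM foo
WHERE id NOT IN (
    SELECT min(id)
    FROM foo
    GROUP BY hash
);

-- Remaining rows: ids 1, 3, and 6 (one per hash value).
SELECT id, hash FROM foo ORDER BY id;
```

With max(id) in place of min(id), the same data would instead leave ids 4, 5, and 6. Running the DELETE inside a transaction and checking the result before committing is a cautious way to try it on real data.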
Alternative Query
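An alternative approach uses the ROW_NUMBER() window function. PostgreSQL does not allow window functions directly in a WHERE clause, so the numbering is done in a subquery:
DELETE FROM foo
WHERE id IN (
    SELECT id
    FROM (
        SELECT id, ROW_NUMBER() OVER (PARTITION BY hash ORDER BY id) AS rn
        FROM foo
    ) AS numbered
    WHERE rn > 1
);
The inner query numbers the rows within each hash group in ascending id order. Rows numbered greater than 1 are the extra copies, and the outer DELETE removes them, leaving the row with the lowest id in each group.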
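Before deleting anything, it can help to preview the numbers the window function assigns. The sketch below assumes a hypothetical toy table (the column types and sample values are illustrative, not from the original). Because PostgreSQL permits window functions only in the SELECT list and ORDER BY clause, the DELETE wraps the numbering in a subquery.

```sql
-- Hypothetical sample table; column types are assumed for illustration.
CREATE TEMP TABLE foo (
    id   integer PRIMARY KEY,
    hash text NOT NULL
);

INSERT INTO foo (id, hash) VALUES
    (1, 'a'), (2, 'a'), (3, 'b'), (4, 'a'), (5, 'b'), (6, 'c');

-- Preview: number the rows within each hash group, lowest id first.
SELECT id,
       hash,
       ROW_NUMBER() OVER (PARTITION BY hash ORDER BY id) AS rn
FROM foo
ORDER BY hash, id;
-- rn is 1 for ids 1, 3, 6 and greater than 1 for ids 2, 4, 5.

-- Delete everything except the first row of each group.
DELETE FROM foo
WHERE id IN (
    SELECT id
    FROM (
        SELECT id,
               ROW_NUMBER() OVER (PARTITION BY hash ORDER BY id) AS rn
        FROM foo
    ) AS numbered
    WHERE rn > 1
);
```

Changing the ORDER BY inside the OVER clause (for example to ORDER BY id DESC) changes which row in each group gets number 1 and is therefore kept.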
Conclusion
Either query deletes duplicate rows in PostgreSQL while keeping one copy from each set. As written, both keep the row with the lowest id for each hash value, so they leave the same rows behind.