Home >Database >Mysql Tutorial >How to Keep Only One Copy of Duplicate Rows in PostgreSQL?

How to Keep Only One Copy of Duplicate Rows in PostgreSQL?

Mary-Kate Olsen
Mary-Kate OlsenOriginal
2025-01-06 10:15:43796browse

How to Keep Only One Copy of Duplicate Rows in PostgreSQL?

Preserving One Copy of Duplicate Rows in Postgres

When dealing with duplicate data, it's often necessary to remove the redundancies while retaining at least one instance of each duplicate set. In Postgres, this can be achieved through the following approach:

To delete duplicate rows while retaining a single copy, we can use the following query:

DELETE FROM foo
WHERE id NOT IN (
  SELECT MIN(id)  -- or MAX(id) for the latest copy
  FROM foo
  GROUP BY hash
);

In this query:

  • foo is the table containing the duplicate rows.
  • id is the primary key column.
  • hash is a column or expression that identifies duplicate rows (e.g., a unique combination of columns or a hash function).

By using MIN(id) or MAX(id) inside the subquery, we ensure that only the oldest or newest copy of each duplicate set is retained.

This approach ensures that at least one copy of each duplicate row is preserved in the table while removing the unnecessary duplicates.

The above is the detailed content of How to Keep Only One Copy of Duplicate Rows in PostgreSQL?. For more information, please follow other related articles on the PHP Chinese website!

Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn