Home  >  Article  >  Database  >  How to Remove Duplicate Email Addresses in MySQL While Keeping the Most Recent Record?

How to Remove Duplicate Email Addresses in MySQL While Keeping the Most Recent Record?

Mary-Kate Olsen
Mary-Kate OlsenOriginal
2024-11-15 21:51:02597browse

How to Remove Duplicate Email Addresses in MySQL While Keeping the Most Recent Record?

MySQL: Deleting Duplicate Records While Retaining the Most Recent

Question: How can I eliminate duplicate email addresses in a MySQL table while preserving the most recent one based on the unique ID field?

Solution:

To achieve this, follow these steps:

  1. Identify Duplicate Emails:

    • Retrieve a list of all duplicate email addresses using the query:

      SELECT email
      FROM test
      GROUP BY email
      HAVING COUNT(*) > 1;
  2. Find the Latest ID for Each Duplicate:

    • Determine the latest ID associated with each duplicate email using the query:

      SELECT MAX(id) AS lastID, email
      FROM test
      WHERE email IN (
        SELECT email
        FROM test
        GROUP BY email
        HAVING COUNT(*) > 1
      )
      GROUP BY email;
  3. Delete Oldest Duplicates:

    • Perform a DELETE operation to remove duplicate emails with an ID less than the latest one:

      DELETE test
      FROM test
      INNER JOIN (
        SELECT MAX(id) AS lastID, email
        FROM test
        WHERE email IN (
          SELECT email
          FROM test
          GROUP BY email
          HAVING COUNT(*) > 1
        )
        GROUP BY email
      ) duplic ON duplic.email = test.email
      WHERE test.id < duplic.lastID;

Optimized Version:

The following optimized version provides the same result while potentially improving performance:

DELETE test
FROM test
INNER JOIN (
  SELECT MAX(id) AS lastID, email
  FROM test
  GROUP BY email
  HAVING COUNT(*) > 1) duplic ON duplic.email = test.email
WHERE test.id < duplic.lastID;

This revised DELETE statement focuses on eliminating the oldest duplicates.

Additional Option:

Alternatively, you can utilize this query provided by Rene Limon:

DELETE FROM test
WHERE id NOT IN (
  SELECT MAX(id)
  FROM test
  GROUP BY email);

This approach retains the most recent duplicate email address based on the maximum ID.

The above is the detailed content of How to Remove Duplicate Email Addresses in MySQL While Keeping the Most Recent Record?. For more information, please follow other related articles on the PHP Chinese website!

Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn