Requirement: Delete the items with the same primaryid and drugname in DRUG, and keep the smallest row of drug_seq
Question: Mysql uses these methods to remove duplicates, why can only a few hundred or dozens of items be removed each time? Duplicate data is just like squeezing out toothpaste. Every time you execute it, you can remove a little more duplication. There are obviously hundreds of thousands of duplicate data. I have tried several methods, but they are still incomplete.
#Method 1:
DELETE FROM `drug2022` WHERE drug_seq IN(
SELECT drug_seq FROM (
SELECT drug_seq FROM `drug2022` WHERE (primaryid,drugname) IN (SELECT primaryid,drugname FROM `drug2022` GROUP BY primaryid,drugname HAVING COUNT(*) > 1)
AND
drug_seq NOT IN (SELECT MIN(drug_seq) FROM `drug2022` GROUP BY primaryid,drugname HAVING COUNT (*) > 1))AS a1);
#Method 2:
DELETE
FROM `drug2022`
WHERE drug_seq NOT IN (
) (SELECT t1.min_drug_seq
) FROM (SELECT MIN(drug_seq) AS min_drug_seq FROM `drug2022` GROUP BY drugname, primaryid HAVING COUNT(1) > 1) t1))
AND (drugname, primaryid) IN
(SELECT t2.drugname, t2.primaryid
) FROM (SELECT drugname, primaryid FROM `drug2022` GROUP BY drugname , primaryid HAVING COUNT(1) > 1) t2);
#Method 3:
DELETE t1
FROM `drug2022` t1,
`drug2022` t2
WHERE t1.primaryid = t2.primaryid
AND t1.drugname = t2.drugname
AND t1.drug_seq < t2.drug_seq;
#Method 4
DELETE
FROM `drug2022`
WHERE drug_seq NOT IN (SELECT * FROM (SELECT MIN(drug_seq) FROM `drug2022` GROUP BY primaryid, drugname) t2);