Home >php教程 >PHP开发 >SQL finds all duplicate record data in a table

SQL finds all duplicate record data in a table

高洛峰
高洛峰Original
2016-11-16 11:00:541514browse

1. I encountered a problem during the interview, which is to write a table with two fields, id and name, and query all the data with repeated names. Now the list is as follows:

SELECT * from xi a where (a.username) in  (SELECT username from xi group by username  having count(*) > 1)

2. After querying and grouping all the data, The query data and the number of repetitions of duplicate data are listed below first:

SELECT count(username) as '重复次数',username from xi group by username  having count(*)>1 order by username desc

3. To check other people’s results, here are the following: A complete method of querying and deleting duplicate records

1. Find the redundant duplicate records in the table, Duplicate records are judged based on a single field (peopleId)

SELECT * from people
where peopleId in (select  peopleId  from  people  group  by  peopleId  having  count(peopleId) > 1)

2. Delete redundant duplicate records in the table. Duplicate records are judged based on a single field (peopleId), leaving only the record with the smallest rowid

DELETE from people 
where peopleId  in (select  peopleId  from people  group  by  peopleId   having  count(peopleId) > 1)
and rowid not in (select min(rowid) from  people  group by peopleId  having count(peopleId )>1)

In addition, in MySQL Using the above SQL will report an error: You can't specify target table 'XXX' for update in FROM clause. You cannot query the same table while modifying the table. Therefore, you can name the query result as a table like the following statement. Temporary tables are used as conditions for modifying table data to avoid the above errors.

DELETE from user where user_id in(
    (SELECT user_id from 
        (SELECT c.* from cab_user c where c.extra_id in 
            (SELECT  a.extra_id  from user a group  by  a.extra_id having  count(a.extra_id) > 1) and c.user_id not in 
            (SELECT min(b.user_id) from  user b group by b.extra_id having count(b.extra_id )>1)
        ) d
    )
)

3. Lookup the redundant duplicate records (multiple fields) in the table

SELECT * from vitae a
where (a.peopleId,a.seq) in  (SELECT peopleId,seq from vitae group by peopleId,seq  having count(*) > 1)

4. Delete the redundant duplicate records (multiple fields) in the table, leaving only the record with the smallest rowid

delete from vitae a
where (a.peopleId,a.seq) in  (SELECT peopleId,seq from vitae group by peopleId,seq having count(*) > 1)
and rowid not in (select min(rowid) from vitae group by peopleId,seq having count(*)>1)

5. Lookup the table Redundant duplicate records (multiple fields), excluding the record with the smallest rowid

SELECT * from vitae a
where (a.peopleId,a.seq) in  (SELECT peopleId,seq from vitae group by peopleId,seq having count(*) > 1)
and rowid not in (select min(rowid) from vitae group by peopleId,seq having count(*)>1)

(2)
For example
there is a field "name" in table A,
and the "name" value between different records is possible will be the same,
Now we need to check out the duplicate items with "name" value among the records in the table;

SELECT Name,Count(*) From A Group By Name Having Count(*) > 1

If we also check that the gender is also the same, then it is as follows:

SELECT Name,sex,Count(*) From A Group By Name,sex Having Count(*) > 1

(3)
Method 1 declare @max integer,@id integerdeclare cur_rows cursor local for select main field, count(*) from table name group by main field having count(*) >; 1open cur_rowsfetch cur_rows into @id,@maxwhile @@fetch_status=0beginselect @ max = @max -1set rowcount @maxdelete from table name where main field = @idfetch cur_rows into @id,@maxendclose cur_rowsset rowcount 0

Method 2 "Duplicate Records" has two meanings of duplicate records, one is completely duplicated Records, that is, records in which all fields are repeated, and second, records in which some key fields are repeated, such as the Name field is repeated, while other fields may not be repeated or can be ignored.

 1. For the first type of duplication, it is easier to solve. Use select distinct * from tableName to get a result set without duplicate records. If the table needs to delete duplicate records (retaining 1 duplicate record), you can delete it as follows: select distinct * into #Tmp from tableNamedrop table tableNameselect * into tableName from #Tmpdrop table #Tmp The reason for this duplication is that the table design is not correct It occurs in weeks and can be solved by adding a unique index column.

  2. This type of duplication problem usually requires retaining the first record among the duplicate records. The operation method is as follows. Suppose there are duplicate fields as Name and Address, and it is required to obtain the unique result set of these two fields select identity(int,1, 1) as autoID, * into #Tmp from tableNameselect min(autoID) as autoID into #Tmp2 from #Tmp group by Name,autoIDselect * from #Tmp where autoID in(select autoID from #tmp2) The last select is Name, Address is a non-duplicate result set (but there is an extra autoID field. This column can be omitted in the select clause when actually writing)

(4) Repeated query

SELECT * from tablename where id in (select id from tablenamegroup by idhaving count(id) > 1)


Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn