Working with large datasets in my mariaDB database. I have two tables, Table A contains 57 million rows and Table B contains about 500 rows. Table B is a subset of ids related to columns in Table A. I want to delete all rows in A table B.
Example table A:
classification_id | Name |
---|---|
20 | Mercedes |
30 | Kawasaki |
80 | Leitz |
70 | HP |
Example Table B:
classification_id | type |
---|---|
20 | car |
30 | bike |
40 | the bus |
50 | Boat |
So, in this example, the last two rows in table A will be deleted (or a mirrored table will be created containing only the first two rows, which is also OK).
I tried executing a second query using an inner join, but this query took a few minutes and gave an out of memory exception.
Any suggestions on how to resolve this issue?
P粉2580834322023-09-08 09:32:15
Since you say the filtered table contains a relatively small number of rows, your best option is to create a separate table with the same columns as the original table A
and rows that match your criteria , then replace the original table and delete it. Also, with this number of IDs, you may want to use WHERE IN ()
instead of a join - as long as the fields you use there are indexed, it will usually be way em> faster. Putting it all together:
CREATE TABLE new_A AS SELECT A.* FROM A WHERE classification_id IN (SELECT classification_id FROM B); RENAME TABLE A TO old_A, new_A to A; DROP TABLE old_A;
Things to note:
A
has any indexes or foreign keys, these will not be copied - so you must recreate them manually. I recommend running SHOW CREATE TABLE A
first and taking note of its structure. Alternatively, you might consider using the output of SHOW CREATE TABLE A
as a template to explicitly create table new_A
, and then do INSERT INTO new_A SELECT ... code> instead of CREATE TABLE new_A AS SELECT ...
and use the same query after that.
P粉1558329412023-09-08 00:49:06
Try this:
delete from "table A" where classification_id not in (select classification_id from "table B");