search

Home  >  Q&A  >  body text

"Filter" huge MariaDB/Mysql tables based on different tables

Working with large datasets in my mariaDB database. I have two tables, Table A contains 57 million rows and Table B contains about 500 rows. Table B is a subset of ids related to columns in Table A. I want to delete all rows in A table B.

Example table A:

classification_id Name
20 Mercedes
30 Kawasaki
80 Leitz
70 HP

Example Table B:

classification_id type
20 car
30 bike
40 the bus
50 Boat

So, in this example, the last two rows in table A will be deleted (or a mirrored table will be created containing only the first two rows, which is also OK).

I tried executing a second query using an inner join, but this query took a few minutes and gave an out of memory exception.

Any suggestions on how to resolve this issue?

P粉287345251P粉287345251486 days ago612

reply all(2)I'll reply

  • P粉258083432

    P粉2580834322023-09-08 09:32:15

    Since you say the filtered table contains a relatively small number of rows, your best option is to create a separate table with the same columns as the original table A and rows that match your criteria , then replace the original table and delete it. Also, with this number of IDs, you may want to use WHERE IN () instead of a join - as long as the fields you use there are indexed, it will usually be way faster. Putting it all together:

    CREATE TABLE new_A AS
      SELECT A.* FROM A 
      WHERE classification_id IN (SELECT classification_id FROM B);
    RENAME TABLE A TO old_A, new_A to A;
    DROP TABLE old_A;

    Things to note:

    • Back up your data! And test the query thoroughly before running DROP TABLE. You don’t want to lose 57M rows of data because of random answers on StackOverflow.
    • If A has any indexes or foreign keys, these will not be copied - so you must recreate them manually. I recommend running SHOW CREATE TABLE A first and taking note of its structure. Alternatively, you might consider using the output of SHOW CREATE TABLE A as a template to explicitly create table new_A, and then do INSERT INTO new_A SELECT ... instead of CREATE TABLE new_A AS SELECT ... and use the same query after that.

    reply
    0
  • P粉155832941

    P粉1558329412023-09-08 00:49:06

    Try this:

    delete from "table A" where classification_id not in (select classification_id from "table B");

    reply
    0
  • Cancelreply