search

Home  >  Q&A  >  body text

SQL processing to clean and sum duplicate data in joined tables

I am pulling reports from an accounting system. The system has a nasty habit of placing duplicates in the database, which it handles internally (instead of cleaning them up in the first place!)

Take the total table of invoice 125 as an example:

+------------+-----------+----------+
| invoice_id | code      | amount   |
+------------+-----------+----------+
|        125 | sub_total | 300.0000 |
|        125 | tax       |  30.0000 |
|        125 | total     | 330.0000 |
|        125 | sub_total | 300.0000 |
|        125 | tax       |  30.0000 |
|        125 | total     | 330.0000 |
+------------+-----------+----------+

and the invoice table with the same ID

+-----+----------+
| id  | amount   |
+-----+----------+
| 125 | 330.0000 |
+-----+----------+

I want to generate total sales and tax for a period (for Australian BAS)

My minimal working example query (which would work if the data was clean) is

select sum(a.amount) as total_sales, sum(c.amount) as total_GST
from 7cn_invoices a 
    INNER JOIN 7cn_invoice_totals c ON a.id = c.invoice_id
where c.code = 'tax';

However, due to duplicates in the totals table, I am getting twice the total sales that it should be. Aside from patching the code, is there any best way to solve this problem?

P粉354948724P粉354948724439 days ago531

reply all(1)I'll reply

  • P粉511985082

    P粉5119850822023-09-14 11:15:40

    You can remove duplicates by using a subquery with DISTINCT

    db<>fiddle here

    reply
    0
  • Cancelreply