Thank you very much in advance for your support. I'm learning SQL and generating some data to play with, but I'm stuck on one simple thing. The table in the screenshot below shows a portion of my sales20 table that shows the sales achieved by each employee in 2020 along with region and date information.
Table Sales 20
I have the same table from 2021. What I want to do is; I want to provide the employee name, region, total sales achieved by the employee in 2020 and 2021. I wrote the following query but it is not giving the correct numbers. It brings something higher than it deserves. The important point is that employees in 2020 and 2021 are different. For example, there is a new employee in 2021, which means he should be listed as well, but the 2020 column for that employee should be empty.
My query:
SELECT sales20.staff, sales20.region, SUM(sales20.amount) AS Total_20, SUM(sales21.amount) AS Total_21 FROM sales20 JOIN sales21 ON sales20.staff = sales21.staff GROUP BY staff, region
Partial screenshot of the results:
result:
Can you tell me what I'm doing wrong?
P粉4764755512024-04-05 11:55:05
When you need to merge two tables, you are joining them -
SELECT staff, region, SUM(IF(YEAR(saledate) = '2020',amount,0) AS Total_20, SUM(IF(YEAR(saledate) = '2021',amount,0) AS Total_21, FROM (SELECT staff, region, amount, saledate FROM sales20 UNION ALL SELECT staff, region, amount, saledate FROM sales21) GROUP BY staff, region;
P粉1388714852024-04-05 00:24:31
SELECT sales20.staff, sales20.region, SUM(sales20.amount) AS Total_20, SUM(sales21.amount) AS Total_21 FROM sales20 JOIN sales21 ON sales20.staff = sales21.staff AND sales20.region = sales21.region GROUP BY staff, region
You need to join 2 tables based on person and region for a one to one join. If you join based only on employees then it will do a one to many join so you will get corrupted output.
Another approach is to merge the data from the two tables and then perform aggregation on this basis. This should give you accurate results.
WITH combined_data AS ( SELECT staff, region, saledate, amount FROM sales20 UNION ALL SELECT staff, region, saledate, amount FROM sales21 ) SELECT staff, region, SUM(CASE WHEN year(saledate) = 2020 THEN amount ELSE 0 END) AS Total_20, SUM(CASE WHEN year(saledate) = 2021 THEN amount ELSE 0 END) AS Total_21 FROM combined_data GROUP BY staff, region