Edit 3 - MySQL version is 8.0.33.
Edit 2 - See bottom for final working code. Thanks @Akina!
I have a score sheet for a sporting event. The table has three related fields that I want to select -
scoreID
as primary key valueclassifierID
Maps to the primary key of another table that contains details about a specific course layoutcalculatedPercent
is the result of a specific eventThe table has three other fields that I use in the WHERE clause, but these are incidental.
I need to generate a query that selects the four best values for calculatedPercent
and specifies that classifierID
cannot be repeated. I need to be able to capture the scoreID
for use in later stages of the process.
This is my first query:
SELECT `masterScores`.`scoreID`, `masterScores`.`classifierID`, `masterScores`.`calculatedPercent` FROM `masterScores` WHERE `masterScores`.`memberID` = 3516 AND `masterScores`.`eventDivision` = "O" AND `masterScores`.`scoreUnusable` != "TRUE" ORDER BY `masterScores`.`calculatedPercent` DESC LIMIT 4
Initially I thought this was great because it does select the row with the highest calculatedPercent
value for the given member scoreID
value. Then I noticed that several members had the first and second highest marks in the same course, which violated the requirement that classifierID
values not be repeated.
I gave SELECT DISTINCT a try but eventually realized what I really needed was GROUP BY, so I did some research and found that I was getting errors related to only_full_group_by when executing queries in MySql, but that didn't completely solve it for me The problem.
What I tried next:
SELECT `masterScores`.`scoreID`, `masterScores`.`classifierID`, MAX(`masterScores`.`calculatedPercent`) AS bestPercent FROM `masterScores` WHERE `masterScores`.`memberID` = 3516 AND `masterScores`.`eventDivision` = "O" AND `masterScores`.`scoreUnusable` != "TRUE" GROUP BY `masterScores`.`classifierID` ORDER BY bestPercent DESC LIMIT 4
This is the following error message:
#1055 - Expression #1 of the ORDER BY clause is not in the GROUP BY clause and contains the non-aggregated column '.masterScores.calculatedPercent', which is not functionally dependent on the columns in the GROUP BY clause; This is incompatible with sql_mode=only_full_group_by
I considered using MIN and MAX for the masterScores
.scoreID
column, but it doesn't work as expected; scoreID
the primary key value is not always selected# The value of ##calculatedPercent. I read somewhere that since
scoreID is the primary key, I can fix this by using ANY_VALUE aggregation. I tried this:
SELECT ANY_VALUE(`masterScores`.`scoreID`), `masterScores`.`classifierID`, MAX(`masterScores`.`calculatedPercent`) AS bestPercent FROM `masterScores` WHERE `masterScores`.`memberID` = 3516 AND `masterScores`.`eventDivision` = "O" AND `masterScores`.`scoreUnusable` != "TRUE" GROUP BY `masterScores`.`classifierID` ORDER BY bestPercent DESC LIMIT 4At first glance, this does seem to work, but it does not always return a
scoreID value that matches the bestPercent value.
calculatedPercent and 1
scoreID value are selected for each
classifierID. Without grouping by
classifierID, each
classifierID could have anywhere from 0 to 400 rows satisfying the WHERE clause, so I think GROUP BY would be appropriate here.
calculatedPercent selected for each grouping's
classifierID is the highest value of all options
calculatedPercent value selected.
calculatedPercent value.
scoreID value for each selected row actually represents the same row as the selected
calculatedPercent (currently, this is the point at which the percentage is calculated) mine Query failed).
Score ID | Classifier ID | Best Percentage |
---|---|---|
58007 | 42 | 66.60 |
63882 | 42 | 64.69 |
64685 | 54 | 64.31 |
58533 | 32 | 63.20 |
55867 | 42 | 62.28 |
66649 | 7 | 56.79 |
55392 | 12 | 50.28 |
58226 | 1 | 49.52 |
55349 | 7 | 41.10 |
This is the desired output when I run the query:
Score ID | Classifier ID | Best Percentage |
---|---|---|
58007 | 42 | 66.60 |
64685 | 54 | 64.31 |
58533 | 32 | 63.20 |
66649 | 7 | 56.79 |
This is the actual output when I run the query:
Score ID | Classifier ID | Best Percentage |
---|---|---|
55867 | 42 | 66.60 |
64685 | 54 | 64.31 |
58533 | 32 | 63.20 |
55349 | 7 | 56.79 |
As shown in the figure, the scoreID
values of the first and fourth rows of the actual output are incorrect.
Currently, I welcome any suggestions.
Edit 2 - Final working solution
WITH cte AS ( SELECT scoreID, classifierID, calculatedPercent AS bestPercent, ROW_NUMBER() OVER (PARTITION BY classifierID ORDER BY calculatedPercent DESC, scoreID DESC) AS rn FROM masterScores WHERE memberID = 3516 AND eventDivision = "O" AND scoreUnusable != "TRUE" ) SELECT scoreID, classifierID, bestPercent FROM cte WHERE rn = 1 ORDER BY bestPercent DESC LIMIT 4
I was able to test this against six cases where the problem occurred, and the solution resolved every issue. Thanks again @Akina!
will mark this issue as resolved.
P粉6968918712023-09-11 13:48:48
SELECT t1.scoreID, classifierID, calculatedPercent AS bestPercent FROM masterScores t1 NATURAL JOIN ( SELECT classifierID, MAX(calculatedPercent) AS calculatedPercent FROM masterScores t2 WHERE memberID = 3516 AND eventDivision = "O" AND scoreUnusable != "TRUE" GROUP BY 1 ORDER BY calculatedPercent DESC LIMIT 4 ) t2
If (classifierID,calculatedPercent)
is not unique, you may receive multiple rows for each classifierID
. In this case you need
SELECT MAX(t1.scoreID) AS scoreID, classifierID, calculatedPercent AS bestPercent FROM masterScores t1 NATURAL JOIN ( SELECT classifierID, MAX(calculatedPercent) AS calculatedPercent FROM masterScores t2 WHERE memberID = 3516 AND eventDivision = "O" AND scoreUnusable != "TRUE" GROUP BY 1 ORDER BY calculatedPercent DESC LIMIT 4 ) t2 GROUP BY 2, 3PS. If your MySQL version is 8, you must use
ROW_NUMBER()
in the CTE instead of a subquery.
WITH cte AS ( SELECT scoreID, classifierID, calculatedPercent AS bestPercent, ROW_NUMBER() OVER (PARTITION BY classifierID ORDER BY calculatedPercent DESC, scoreID DESC) AS rn FROM masterScores ) SELECT scoreID, classifierID, bestPercent FROM cte WHERE rn = 1