search

Home  >  Q&A  >  body text

MySQL SELECT with GROUP BY one field and ORDER BY another field

Edit 3 - MySQL version is 8.0.33.

Edit 2 - See bottom for final working code. Thanks @Akina!

I have a score sheet for a sporting event. The table has three related fields that I want to select -

  1. scoreID as primary key value
  2. classifierID Maps to the primary key of another table that contains details about a specific course layout
  3. calculatedPercent is the result of a specific event

The table has three other fields that I use in the WHERE clause, but these are incidental.

I need to generate a query that selects the four best values ​​for calculatedPercent and specifies that classifierID cannot be repeated. I need to be able to capture the scoreID for use in later stages of the process.

This is my first query:

SELECT `masterScores`.`scoreID`, `masterScores`.`classifierID`, `masterScores`.`calculatedPercent` 
FROM `masterScores` 
WHERE `masterScores`.`memberID` = 3516 AND `masterScores`.`eventDivision` = "O" AND `masterScores`.`scoreUnusable` != "TRUE" 
ORDER BY `masterScores`.`calculatedPercent` DESC LIMIT 4

Initially I thought this was great because it does select the row with the highest calculatedPercent value for the given member scoreID value. Then I noticed that several members had the first and second highest marks in the same course, which violated the requirement that classifierID values ​​not be repeated.

I gave SELECT DISTINCT a try but eventually realized what I really needed was GROUP BY, so I did some research and found that I was getting errors related to only_full_group_by when executing queries in MySql, but that didn't completely solve it for me The problem.

What I tried next:

SELECT `masterScores`.`scoreID`, `masterScores`.`classifierID`, MAX(`masterScores`.`calculatedPercent`) AS bestPercent 
FROM `masterScores` 
WHERE `masterScores`.`memberID` = 3516 AND `masterScores`.`eventDivision` = "O" AND `masterScores`.`scoreUnusable` != "TRUE" 
GROUP BY `masterScores`.`classifierID` 
ORDER BY bestPercent DESC LIMIT 4

This is the following error message:

#1055 - Expression #1 of the ORDER BY clause is not in the GROUP BY clause and contains the non-aggregated column '.masterScores.calculatedPercent', which is not functionally dependent on the columns in the GROUP BY clause; This is incompatible with sql_mode=only_full_group_by

I considered using MIN and MAX for the masterScores.scoreID column, but it doesn't work as expected; scoreID the primary key value is not always selected# The value of ##calculatedPercent. I read somewhere that since scoreID is the primary key, I can fix this by using ANY_VALUE aggregation. I tried this:

SELECT ANY_VALUE(`masterScores`.`scoreID`), `masterScores`.`classifierID`, MAX(`masterScores`.`calculatedPercent`) AS bestPercent 
FROM `masterScores` 
WHERE `masterScores`.`memberID` = 3516 AND `masterScores`.`eventDivision` = "O" AND `masterScores`.`scoreUnusable` != "TRUE" 
GROUP BY `masterScores`.`classifierID` 
ORDER BY bestPercent DESC LIMIT 4

At first glance, this does seem to work, but it does not always return a

scoreID value that matches the bestPercent value.

Again, the goal is:

  1. Based on the specified WHERE clause, only 1

    calculatedPercent and 1 scoreID value are selected for each classifierID. Without grouping by classifierID, each classifierID could have anywhere from 0 to 400 rows satisfying the WHERE clause, so I think GROUP BY would be appropriate here.

  2. Ensure that the

    calculatedPercent selected for each grouping's classifierID is the highest value of all options

  3. Make sure that only 4 rows are selected and that these are the rows with the highest

    calculatedPercent value selected.

  4. Ensure that the selected 4 rows are sorted in descending order based on the

    calculatedPercent value.

  5. Make sure that the

    scoreID value for each selected row actually represents the same row as the selected calculatedPercent (currently, this is the point at which the percentage is calculated) mine Query failed).

The following is a subset of data, for example:

Score ID Classifier ID Best Percentage
58007 42 66.60
63882 42 64.69
64685 54 64.31
58533 32 63.20
55867 42 62.28
66649 7 56.79
55392 12 50.28
58226 1 49.52
55349 7 41.10

This is the desired output when I run the query:

Score ID Classifier ID Best Percentage
58007 42 66.60
64685 54 64.31
58533 32 63.20
66649 7 56.79

This is the actual output when I run the query:

Score ID Classifier ID Best Percentage
55867 42 66.60
64685 54 64.31
58533 32 63.20
55349 7 56.79

As shown in the figure, the scoreID values ​​of the first and fourth rows of the actual output are incorrect.

Currently, I welcome any suggestions.

Edit 2 - Final working solution

WITH cte AS (
    SELECT scoreID, classifierID, calculatedPercent AS bestPercent,
           ROW_NUMBER() OVER (PARTITION BY classifierID ORDER BY calculatedPercent DESC, scoreID DESC) AS rn
    FROM masterScores WHERE memberID = 3516 AND eventDivision = "O" AND scoreUnusable != "TRUE"
)
SELECT scoreID, classifierID, bestPercent
FROM cte
WHERE rn = 1
ORDER BY bestPercent DESC
LIMIT 4

I was able to test this against six cases where the problem occurred, and the solution resolved every issue. Thanks again @Akina!

will mark this issue as resolved.

P粉715274052P粉715274052459 days ago569

reply all(1)I'll reply

  • P粉696891871

    P粉6968918712023-09-11 13:48:48

    SELECT t1.scoreID, classifierID, calculatedPercent AS bestPercent 
    FROM masterScores t1
    NATURAL JOIN (
        SELECT classifierID, MAX(calculatedPercent) AS calculatedPercent
        FROM masterScores t2
        WHERE memberID = 3516 AND eventDivision = "O" AND scoreUnusable != "TRUE" 
        GROUP BY 1
        ORDER BY calculatedPercent DESC LIMIT 4
        ) t2

    If (classifierID,calculatedPercent) is not unique, you may receive multiple rows for each classifierID. In this case you need

    SELECT MAX(t1.scoreID) AS scoreID, classifierID, calculatedPercent AS bestPercent 
    FROM masterScores t1
    NATURAL JOIN (
        SELECT classifierID, MAX(calculatedPercent) AS calculatedPercent
        FROM masterScores t2
        WHERE memberID = 3516 AND eventDivision = "O" AND scoreUnusable != "TRUE" 
        GROUP BY 1
        ORDER BY calculatedPercent DESC LIMIT 4
        ) t2
    GROUP BY 2, 3
    PS. If your MySQL version is 8, you must use ROW_NUMBER() in the CTE instead of a subquery.


    WITH cte AS (
        SELECT scoreID, classifierID, calculatedPercent AS bestPercent,
               ROW_NUMBER() OVER (PARTITION BY classifierID ORDER BY calculatedPercent DESC, scoreID DESC) AS rn
        FROM masterScores 
    )
    SELECT scoreID, classifierID, bestPercent
    FROM cte
    WHERE rn = 1

    reply
    0
  • Cancelreply