search

Home  >  Q&A  >  body text

SQL select only rows with maximum value on column

<p>I have this document form (here is a simplified version): </p> <table class="s-table"> <thead> <tr> <th>id</th> <th>Transfer</th> <th>Content</th> </tr> </thead> <tbody> <tr> <td>1</td> <td>1</td> <td>...</td> </tr> <tr> <td>2</td> <td>1</td> <td>...</td> </tr> <tr> <td>1</td> <td>2</td> <td>...</td> </tr> <tr> <td>1</td> <td>3</td> <td>...</td> </tr> </tbody> </table> <p>How to select one row per ID and only the largest RPM? </p><p> Using the above data, the result should contain two rows: <code>[1, 3, ...]</code> and <code>[2, ​​1, ..]</code>. I'm using<em><strong>MySQL</strong></em>. </p> <p>Currently, I'm using a check in a <code>while</code> loop to detect and overwrite old revs in the result set. But is this the only way to achieve this result? Don't have a <strong>SQL</strong> solution? </p>
P粉937382230P粉937382230517 days ago458

reply all(2)I'll reply

  • P粉638343995

    P粉6383439952023-08-24 09:47:12

    My preference is to use as little code as possible...

    You can do this using IN Try this:

    SELECT * 
    FROM t1 WHERE (id,rev) IN 
    ( SELECT id, MAX(rev)
      FROM t1
      GROUP BY id
    )

    In my opinion, it's less complex... easier to read and maintain.

    reply
    0
  • P粉517475670

    P粉5174756702023-08-24 09:16:15

    At first glance...

    All you need is a GROUP BY clause with a MAX aggregate function:

    SELECT id, MAX(rev)
    FROM YourTable
    GROUP BY id

    Things are never that simple, are they?

    I just noticed that you also need the content column.

    This is a very common problem in SQL: find all the data for a row with some maximum value in each group identifier column. I hear this statement often in my career. In fact, this was one of the questions I answered in a technical interview at my current job.

    In fact, the Stack Overflow community created a tag to address questions like this: .

    Basically, you have two ways to solve this problem:

    Use simple group-identifier, max-value-in-groupSubquery join

    In this approach you first find the group-identifier, max-value-in-group (solved above) in the subquery. Then, join the table to the subquery and make it equal on group-identifier and max-value-in-group:

    SELECT a.id, a.rev, a.contents
    FROM YourTable a
    INNER JOIN (
        SELECT id, MAX(rev) rev
        FROM YourTable
        GROUP BY id
    ) b ON a.id = b.id AND a.rev = b.rev

    Left join with self, adjust connection conditions and filters

    In this approach you left join the table to itself. Equality exists in group-identifier. Then, 2 smart moves:

    1. The second connection condition is that the value on the left is less than the value on the right
    2. When you do step 1, the row that actually has the largest value will show NULL on the right (that's a LEFT JOIN, remember?). We then filter the join results to only show rows with NULL on the right side.

    So you end up with:

    SELECT a.*
    FROM YourTable a
    LEFT OUTER JOIN YourTable b
        ON a.id = b.id AND a.rev < b.rev
    WHERE b.id IS NULL;

    in conclusion

    Both methods will bring exactly the same results.

    If you have two rows group-identifier with max-value-in-group, both rows will appear in the results of both methods.

    Both methods are SQL ANSI compatible, so they will work with your favorite RDBMS regardless of their "flavor".

    Both approaches are also performance-friendly, but your situation may differ (RDBMS, database structure, indexes, etc.). So baselines when you choose one method over another. And make sure you choose the one that makes the most sense to you.

    reply
    0
  • Cancelreply