Home  >  Q&A  >  body text

How to sort duplicate rows in a key-value pair table on multiple columns in MySQL?

<p>So I have the following table of key/value pairs, the user submits data via the form, and each question on the form is added to the table here as a separate row. <code>Submission_id</code> identifies each form submission. </p> <pre class="brush:php;toolbar:false;"> ---- --------------- --------------- -------- | id | submission_id | key | value | ---- --------------- --------------- -------- | 1 | 10 | manufacturer | Apple | | 2 | 10 | model | 5s | | 3 | 10 | firstname | Paul | | 4 | 15 | manufacturer | Apple | | 5 | 15 | model | 5s | | 6 | 15 | firstname | Paul | | 7 | 20 | manufacturer | Apple | | 8 | 20 | model | 5s | | 9 | 20 | firstname | Andrew | ---- --------------- -------------- -------- </pre> <p>As can be seen from the above data, the commits with id 10 and 15 have the same value (just the commit ids are different). This is basically because the user submitted the same form twice, so it's a duplicate. </p> <p>I'm trying to find a way to sort these tables so that any duplicate submissions appear together in order. Given the table above, I'm trying to construct a query that gives me the following results: </p> <pre class="brush:php;toolbar:false;"> --------------- | submission_id | --------------- | 10 | | 15 | | 20 | --------------- </pre> <p>So I want to check if a commit has the same value for the <code>manufacturer</code>, <code>model</code> and <code>firstname</code> keys. If so, then they get the commit id and they are placed adjacent to each other in the results. There are other keys in the actual table, but I only want to match duplicates based on these 3 keys (manufacturer, model, firstname). </p> <p>I've been thinking about this for a long time and trying to find some possible solutions, but haven't found anything reliable. </p>
P粉115840076P粉115840076388 days ago446

reply all(1)I'll reply

  • P粉659518294

    P粉6595182942023-09-06 00:57:21

    This is not a key-value table. Often called entity-attribute-value table/relationship/schema.

    Looking at the problem, if the tables were arranged in regular first and second normal form, this would be trivial - just concatenate the values, group by those values, and count....

    SELECT manufacturer, model, firstname, COUNT(DISTINCT submission_id)
    FROM atable
    GROUP BY  manufacturer, model, firstname
    HAVING COUNT(DISTINCT submission_id)>1;

    Or use connection....

    SELECT a.manufacturer, a.model, a.firstname
    , a.submission_id, b.submission_id
    FROM atable a
    JOIN atable b
    ON a.manufacturer=b.manufacturer
    AND a.model=b.model
    AND a.firstname=b.firstname
    WHERE a.submission_id<b.submission_id
    ;

    Or use sorting and comparing adjacent rows....

    SELECT *
    FROM
    (
    SELECT @prev.submission_id AS prev_submission_id
    , @prev.manufacturer AS prev_manufacturer
    , @prev.model AS prev_model
    , @prev.firstname AS pref_firstname
    , a.submission_id
    , a.manufacturer
    , a.model
    , set @prev.submission_id:=a.submission_id as currsid
    , set @prev.manufacturer:=a.manufacturer as currman
    , set @prev.model:=a.model as currmodel
    , set @prev.firstname=a.forstname as currname
    FROM atable
    ORDER BY manufacturer, model, firstname, submission_id
    )
    WHERE prev_manufacturer=manufacturer
    AND prev_model=model
    AND prev_firstname=firstname
    AND prev_submission_id<>submission_id;

    So the solution is to simply make your data look like a normal relationship....

    SELECT ilv.values
    , COUNT(ilv.submission_id)
    , GROUP_CONCAT(ilv.submission_id)
    FROM 
     (SELECT a.submission_id
      , GROUP_CONCAT(CONCAT(a.key, '=',a.value)) AS values
      FROM atable a
      GROUP BY a.submission_id
     ) ilv
    GROUP BY ilv.values
    HAVING COUNT(ilv.submission_id)>1;

    Hopefully concatenation and sequence based solutions should be obvious now.

    reply
    0
  • Cancelreply