Home  >  Q&A  >  body text

Count the highest group_id for each user as a distinct value

There's some tricky logic in my latest project (at least for me) and I've been trying to use HAVING to solve it, but all my plays have incorrect results.

I have a linked table that assigns users to user groups and another table that records how many users are in each group so I don't need to count them all the time (the second table is done after user registration updated from time to time).

The problem I have is with the function that updates the numbers in the second table.

The linked table looks like this, and here's the sample data that explains my problem:

+----+---------+----------+
| id | user_id | group_id |
+----+---------+----------+
| 1  | 1       | 1        |
| 2  | 1       | 5        |
| 3  | 14      | 2        |
+----+---------+----------+

A user can belong to any number of groups. In fact, in order to be a group, a user must belong to all groups below it. This is because each group is assigned certain permissions. This cannot be changed without a major rewrite of the current application.

What I want to do is count only the highest group for each user, so in the example data above, user 1 will only be counted in group 5, while group 1 will look empty.

Currently, all entries are counted, which means if there are 3 people in group 4, those 3 users are also counted in groups 3, 2, and 1, which makes my numbers relatively Meaningless.

So far I've tried a few variations of:

SELECT user_group_id, COUNT(user_id) members FROM `group_users` GROUP BY user_group_id HAVING count(user_group_id) = 1;

This gives me the same number. Then I tried:

SELECT user_group_id, COUNT(user_id) members FROM `group_users` GROUP BY user_group_id HAVING COUNT(user_id) = 1;

But as I understand now, I'm only asking users who don't appear in any other group.

I feel like I'm on the right track, but I don't know where or what the conditions should be. I've seen some example queries for a similar flow, but all the aliases just add to the confusion.

Can someone point me in the right direction?

P粉237125700P粉237125700277 days ago343

reply all(1)I'll reply

  • P粉431220279

    P粉4312202792024-01-17 13:39:13

    You can use aggregation to get the maximum group_id for each user and the COUNT() window function to count each returned maximum group_id> amount of users:

    SELECT DISTINCT MAX(group_id) AS group_id,
           COUNT(*) OVER (PARTITION BY MAX(group_id)) AS members
    FROM group_users
    GROUP BY user_id;

    This query filters out all group_id that have no members.

    If you want results for all group_id, connect to the above query using LEFT of table groups:

    WITH cte AS (
      SELECT DISTINCT MAX(group_id) AS group_id,
             COUNT(*) OVER (PARTITION BY MAX(group_id)) AS members
      FROM group_users
      GROUP BY user_id
    )
    SELECT g.group_id,
           COALESCE(c.members, 0) AS members
    FROM `groups` AS g LEFT JOIN cte AS c
    ON c.group_id = g.group_id;

    View a simplified demo.

    reply
    0
  • Cancelreply