Home  >  Q&A  >  body text

Retrieve rows in a SQL database that are fully compatible with each other

I have this simple structure in my database

CREATE TABLE species (
  _id INTEGER PRIMARY KEY,
  name TEXT NOT NULL
);
CREATE TABLE compatibility (
  _id INTEGER PRIMARY KEY,
  speciesA INTEGER,
  speciesB INTEGER,
  compatibility TINYINT NOT NULL
);

I would like to point out that speciesA and speciesB are composite unique to prevent duplication of information. For example, if I only had 5 species in my database, the compatibility table would look like this:

INSERT INTO species VALUES (1, 'EspecieA');
INSERT INTO species VALUES (2, 'EspecieB');
INSERT INTO species VALUES (3, 'EspecieC');
INSERT INTO species VALUES (4, 'EspecieD');
INSERT INTO species VALUES (5, 'EspecieD');

INSERT INTO compatibility VALUES (null, 1, 2, 1);
INSERT INTO compatibility VALUES (null, 1, 3, 1);
INSERT INTO compatibility VALUES (null, 1, 4, 1);
INSERT INTO compatibility VALUES (null, 1, 5, 0);
INSERT INTO compatibility VALUES (null, 2, 3, 1);
INSERT INTO compatibility VALUES (null, 2, 4, 1);
INSERT INTO compatibility VALUES (null, 2, 5, 0);
INSERT INTO compatibility VALUES (null, 3, 4, 1);
INSERT INTO compatibility VALUES (null, 3, 5, 1);
INSERT INTO compatibility VALUES (null, 4, 5, 1);

I need to write a query that returns a list of species that are fully compatible with each other from a given list of species, meaning that all species in the resulting list must be compatible with all species in the provided list. The species provided must not appear in the results list.

I tried the following query but it only returns species that are compatible with at least one of the provided species:

SELECT id, name
FROM species s
WHERE s.id NOT IN (
    SELECT IF(speciesA NOT IN (1,2,3), speciesA, speciesB) AS specie
    FROM compatibility 
    WHERE (speciesA IN (1,2,3)
    AND compatible IN (0)) OR (speciesB IN (1,2,3)
    AND compatible IN (0))
)
AND s.id NOT IN (1,2,3);

How can I modify this query to get a list of species that are fully compatible with each other?

For the query above, the expected result should be a species list containing only species 4. Species 1, 2, 3 are excluded from the list provided, and 5 should be excluded due to incompatibility with species 1 and 2.

Any help or advice would be greatly appreciated. Thanks!

P粉103739566P粉103739566427 days ago457

reply all(2)I'll reply

  • P粉684720851

    P粉6847208512023-09-10 10:46:11

    We can solve this problem through filtering and aggregation.

    select spec
    from (
        select 
            case when speciesA in (1, 2, 3) then speciesA else speciesB end as ref,
            case when speciesA in (1, 2, 3) then speciesB else speciesA end as spec
        from compatibility c
    ) c
    where ref in (1, 2, 3)
    group by spec
    having count(*) = 3

    The subquery places the reference species (ref) and the compatible species (spec) in two different columns. We can then filter the three reference species you are interested in, grouping by compatible species, and finally retain the group that exactly matches the having clause.

    If you also want to know the species name, we canjoin:

    select c.spec, s.name
    from (
        select 
            case when speciesA in (1, 2, 3) then speciesA else speciesB end as ref,
            case when speciesA in (1, 2, 3) then speciesB else speciesA end as spec
        from compatibility c
    ) c
    inner join species s on s._id = c.spec
    where c.ref in (1, 2, 3)
    group by c.spec
    having count(*) = 3

    DB Fiddle Demo

    reply
    0
  • P粉739942405

    P粉7399424052023-09-10 00:55:29

    This only includes species that are explicitly linked with compatibility = 1 (i.e. species are assumed to be incompatible by default)

    SELECT s._id, s.name
    FROM species s
    INNER JOIN compatibility c ON
        (s._id = c.speciesA AND c.speciesB IN (1, 2))
        OR (s._id = c.speciesB AND c.speciesA IN (1, 2))
    WHERE c.compatibility = 1
    AND s._id NOT IN (1, 2)

    This will include all species except those explicitly associated with compatibility = 0 (i.e. species are assumed to be compatible by default)

    SELECT s._id, s.name
    FROM species s
    LEFT JOIN compatibility c ON
        ((s._id = c.speciesA AND c.speciesB IN (1, 2))
        OR (s._id = c.speciesB AND c.speciesA IN (1, 2)))
        AND c.compatibility = 0
    WHERE c._id IS NULL
    AND s._id NOT IN (1, 2)

    Whichever of these fits your logic better, I also recommend adding indexes on the compatibility.speciesA and compatibility.speciesB columns to optimize query performance. p>

    reply
    0
  • Cancelreply