search

Home  >  Q&A  >  body text

How to match strings with appended parts using Python, but not match them if their appended parts are different

How to match strings with appended parts, but not match them if they have different appended parts? For example, if I have a list of names and need to match them to names in my database:

|          My DataBase      |       Foreign table       |
| David James Malan Neeko   | David James Malan         |
| David James Malan Neeko   | David James Malan Mathew  |
| David James Malan Neeko   | David jam Mlan            |

I've tried doing an exact match first, and if nothing is found, I keep removing a character from the end of the external name string and try matching again until only one record matches my database.

This technique matches the first and third examples in the table above, but the problem is that it also matches the second example because the program removes one character each time until the entire Mathew word is removed, and then it is found gives a match, which is a false match in this case.

Any suggestions where I can do an exact match only or match but only have the additional part of the name on one side, but not match them if they both have two different parts of the name.

You can use SQL or Python to solve this problem.

Thanks in advance for your help.

P粉676588738P粉676588738489 days ago743

reply all(1)I'll reply

  • P粉145543872

    P粉1455438722023-09-21 11:25:47

    This is very inefficient and will require a full table scan.

    SET @name_str = 'David James Malan Neeko';
    
    SELECT *
    FROM tbl
    WHERE name LIKE CONCAT(@name_str, '%') OR @name_str LIKE CONCAT(name, '%')
    ORDER BY name = @name_str DESC, name LIKE CONCAT(@name_str, '%') DESC;
    
    The first expression in the

    ORDER BY clause will sort for exact matches first, then the second expression will sort for name that starts with the entire search string. Sort followed by the search string starting with name.

    Obviously, you can pass the search string directly as a parameter instead of using SET @name_str = '...';.

    Here are some example db<>fiddle.

    reply
    0
  • Cancelreply