How to match strings with appended parts, but not match them if they have different appended parts? For example, if I have a list of names and need to match them to names in my database:
| My DataBase | Foreign table | | David James Malan Neeko | David James Malan | | David James Malan Neeko | David James Malan Mathew | | David James Malan Neeko | David jam Mlan |
I've tried doing an exact match first, and if nothing is found, I keep removing a character from the end of the external name string and try matching again until only one record matches my database.
This technique matches the first and third examples in the table above, but the problem is that it also matches the second example because the program removes one character each time until the entire Mathew word is removed, and then it is found gives a match, which is a false match in this case.
Any suggestions where I can do an exact match only or match but only have the additional part of the name on one side, but not match them if they both have two different parts of the name.
You can use SQL or Python to solve this problem.
Thanks in advance for your help.
P粉1455438722023-09-21 11:25:47
This is very inefficient and will require a full table scan.
SET @name_str = 'David James Malan Neeko'; SELECT * FROM tbl WHERE name LIKE CONCAT(@name_str, '%') OR @name_str LIKE CONCAT(name, '%') ORDER BY name = @name_str DESC, name LIKE CONCAT(@name_str, '%') DESC;The first expression in the
ORDER BY
clause will sort for exact matches first, then the second expression will sort for name
that starts with the entire search string. Sort followed by the search string starting with name
.
Obviously, you can pass the search string directly as a parameter instead of using SET @name_str = '...';
.
Here are some example db<>fiddle.