How to Calculate String Similarity Percentage in MySQL?

Mary-Kate Olsen
2024-12-30 17:59:09

How to Calculate String Similarity in MySQL

Problem:

You have two strings in MySQL and need to express how similar they are as a percentage. For example, given @a = 'Welcome to Stack Overflow' and @b = 'Hello to stack overflow', you want a single number that describes how closely the two match.

Solution:

  1. Create the Levenshtein Distance Function:

    Use the following function to calculate the Levenshtein distance between two strings:

    CREATE FUNCTION `levenshtein`(s1 text, s2 text) RETURNS int(11)
    DETERMINISTIC
    BEGIN 
    ...
    END

    The above function is adapted from the one provided at http://www.artfulsoftware.com/infotree/queries.php#552.

  2. Create the Levenshtein Similarity Ratio Function:

    To convert the Levenshtein distance into a similarity ratio, use this function:

    CREATE FUNCTION `levenshtein_ratio`( s1 text, s2 text ) RETURNS int(11)
    DETERMINISTIC
    BEGIN 
    ...
    END

Usage:

To calculate the similarity percentage between two strings, use the following formula:

similarity_percentage = ((1 - LEVENSHTEIN(s1, s2) / MAX_LENGTH) * 100)
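  • LEVENSHTEIN(s1, s2): The Levenshtein distance between the two strings, that is, the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into the other.
  • MAX_LENGTH: The length of the longer of the two strings.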
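
As a rough illustration of the formula, the same percentage can be computed inline without the ratio function. This is only a sketch: it assumes the `levenshtein` function from step 1 has already been created, and the use of `CHAR_LENGTH`, `GREATEST`, and `ROUND` is a choice made here rather than something taken from the original function bodies, which are elided above.

```sql
-- Sample strings from the problem statement.
SET @a = 'Welcome to Stack Overflow';
SET @b = 'Hello to stack overflow';

-- similarity_percentage = (1 - distance / MAX_LENGTH) * 100
-- CHAR_LENGTH counts characters (LENGTH would count bytes);
-- GREATEST picks the length of the longer string.
SELECT ROUND(
         (1 - levenshtein(@a, @b)
              / GREATEST(CHAR_LENGTH(@a), CHAR_LENGTH(@b))
         ) * 100
       ) AS similarity_percentage;
```

If both strings are empty, the divisor is 0 and MySQL returns NULL for the division, so callers may want to handle that case separately.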

Example:

SELECT levenshtein_ratio('Welcome to Stack Overflow', 'Hello to stack overflow') AS similarity;

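This query returns the similarity percentage between the two strings as an integer. Working through the formula by hand: the longer string has 25 characters, and turning 'Welcome' into 'Hello' takes 4 edits. If the function compares characters case-insensitively, the distance is 4 and the result is (1 - 4/25) * 100 = 84%. If it compares them case-sensitively, 'S'/'s' and 'O'/'o' add 2 more edits, and the result is (1 - 6/25) * 100 = 76%.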
