Home > Article > Backend Development > Introduction to Wu-Manber algorithm and Python implementation instructions
The Wu-Manber algorithm is a string matching algorithm used to search strings efficiently. It is a hybrid algorithm that combines the advantages of Boyer-Moore and Knuth-Morris-Pratt algorithms to provide fast and accurate pattern matching.
#1. Create a hash table that maps every possible substring of the pattern to that substring The pattern position at which the string occurs.
2. This hash table is used to quickly identify potential starting locations of patterns in text.
3. Loop through the text and compare each character to the corresponding character in the pattern.
4. If the characters match, you can move to the next character and continue the comparison.
5. If the characters do not match, a hash table can be used to determine the maximum number of characters that can be skipped before the next potential starting position of the pattern.
6. This allows the algorithm to quickly skip large portions of text without missing any potential matches.
# Define the hash_pattern() function to generate # a hash for each subpattern def hashPattern(pattern, i, j): h = 0 for k in range(i, j): h = h * 256 + ord(pattern[k]) return h # Define the Wu Manber algorithm def wuManber(text, pattern): # Define the length of the pattern and # text m = len(pattern) n = len(text) # Define the number of subpatterns to use s = 2 # Define the length of each subpattern t = m // s # Initialize the hash values for each # subpattern h = [0] * s for i in range(s): h[i] = hashPattern(pattern, i * t, (i + 1) * t) # Initialize the shift value for each # subpattern shift = [0] * s for i in range(s): shift[i] = t * (s - i - 1) # Initialize the match value match = False # Iterate through the text for i in range(n - m + 1): # Check if the subpatterns match for j in range(s): if hashPattern(text, i + j * t, i + (j + 1) * t) != h[j]: break else: # If the subpatterns match, check if # the full pattern matches if text[i:i + m] == pattern: print("Match found at index", i) match = True # Shift the pattern by the appropriate # amount for j in range(s): if i + shift[j] < n - m + 1: break else: i += shift[j] # If no match was found, print a message if not match: print("No match found") # Driver Code text = "the cat sat on the mat" pattern = "the" # Function call wuManber(text, pattern)
KMP algorithm and Wu Manber algorithm are both string matching algorithms, that is to say, they are used to find a substring in a larger string. Both algorithms have the same time complexity, which means they have the same performance characteristics in terms of the time it takes for the algorithm to run.
However, there are some differences between them:
1. The KMP algorithm uses a preprocessing step to generate a partial matching table, which is used to speed up character String matching process. This makes the KMP algorithm more efficient than the Wu Manber algorithm when the pattern being searched is relatively long.
2. Wu Manber algorithm uses a different method for string matching. It divides the pattern into multiple sub-patterns and uses these sub-patterns to search for matches in the text. This makes the Wu Manber algorithm more efficient than the KMP algorithm when the pattern being searched is relatively short.
The above is the detailed content of Introduction to Wu-Manber algorithm and Python implementation instructions. For more information, please follow other related articles on the PHP Chinese website!