Home  >  Article  >  Backend Development  >  Introduction to Wu-Manber algorithm and Python implementation instructions

Introduction to Wu-Manber algorithm and Python implementation instructions

王林
王林forward
2024-01-23 19:03:051317browse

Introduction to Wu-Manber algorithm and Python implementation instructions

The Wu-Manber algorithm is a string matching algorithm used to search strings efficiently. It is a hybrid algorithm that combines the advantages of Boyer-Moore and Knuth-Morris-Pratt algorithms to provide fast and accurate pattern matching.

Wu-Manber algorithm steps

#1. Create a hash table that maps every possible substring of the pattern to that substring The pattern position at which the string occurs.

2. This hash table is used to quickly identify potential starting locations of patterns in text.

3. Loop through the text and compare each character to the corresponding character in the pattern.

4. If the characters match, you can move to the next character and continue the comparison.

5. If the characters do not match, a hash table can be used to determine the maximum number of characters that can be skipped before the next potential starting position of the pattern.

6. This allows the algorithm to quickly skip large portions of text without missing any potential matches.

Python implements Wu-Manber algorithm

# Define the hash_pattern() function to generate
# a hash for each subpattern


def hashPattern(pattern, i, j):
h = 0
for k in range(i, j):
h = h * 256 + ord(pattern[k])
return h

# Define the Wu Manber algorithm


def wuManber(text, pattern):

# Define the length of the pattern and
# text
m = len(pattern)
n = len(text)

# Define the number of subpatterns to use
s = 2

# Define the length of each subpattern
t = m // s

# Initialize the hash values for each
# subpattern
h = [0] * s
for i in range(s):
h[i] = hashPattern(pattern, i * t, (i + 1) * t)

# Initialize the shift value for each
# subpattern
shift = [0] * s
for i in range(s):
shift[i] = t * (s - i - 1)

# Initialize the match value
match = False

# Iterate through the text
for i in range(n - m + 1):
# Check if the subpatterns match
for j in range(s):
if hashPattern(text, i + j * t, i + (j + 1) * t) != h[j]:
break
else:
# If the subpatterns match, check if
# the full pattern matches
if text[i:i + m] == pattern:
print("Match found at index", i)
match = True

# Shift the pattern by the appropriate
# amount
for j in range(s):
if i + shift[j] < n - m + 1:
break
else:
i += shift[j]

# If no match was found, print a message
if not match:
print("No match found")


# Driver Code
text = "the cat sat on the mat"
pattern = "the"

# Function call
wuManber(text, pattern)

The difference between KMP and Wu-Manber algorithm

KMP algorithm and Wu Manber algorithm are both string matching algorithms, that is to say, they are used to find a substring in a larger string. Both algorithms have the same time complexity, which means they have the same performance characteristics in terms of the time it takes for the algorithm to run.

However, there are some differences between them:

1. The KMP algorithm uses a preprocessing step to generate a partial matching table, which is used to speed up character String matching process. This makes the KMP algorithm more efficient than the Wu Manber algorithm when the pattern being searched is relatively long.

2. Wu Manber algorithm uses a different method for string matching. It divides the pattern into multiple sub-patterns and uses these sub-patterns to search for matches in the text. This makes the Wu Manber algorithm more efficient than the KMP algorithm when the pattern being searched is relatively short.

The above is the detailed content of Introduction to Wu-Manber algorithm and Python implementation instructions. For more information, please follow other related articles on the PHP Chinese website!

Statement:
This article is reproduced at:163.com. If there is any infringement, please contact admin@php.cn delete