
Murmur3 hash compatibility between Go and Python

王林
2024-02-09 13:10:19


In this article, php editor Zimo looks at Murmur3 hash compatibility between Go and Python. Murmur3 is an efficient, non-cryptographic hash algorithm commonly used in hash-based data structures. Libraries in Go and Python do not necessarily return the 128-bit result in the same layout, so the same input can appear to produce different hashes. The question and solution below show where the difference comes from and how to get matching output when passing data between the two languages.

Question content

We have two different libraries, one in Python and one in Go, that need to calculate Murmur3 hashes in the same way. Unfortunately, no matter how hard we try, we can't get the libraries to produce the same result. Judging from this question about Java and Python, compatibility is not necessarily straightforward.

We are currently using the Python mmh3 library and the Go github.com/spaolacci/murmur3 library.

In Go:

hash := murmur3.New128()
hash.Write([]byte("chocolate-covered-espresso-beans"))
fmt.Println(base64.RawURLEncoding.EncodeToString(hash.Sum(nil)))
// Output: cLHSo2nCBxyOezviLM5gwg

In Python:

name = "chocolate-covered-espresso-beans"
hash = mmh3.hash128(name.encode('utf-8'), signed=False).to_bytes(16, byteorder='big', signed=False)
print(base64.urlsafe_b64encode(hash).decode('utf-8').strip("="))
# Output: jns74izOYMJwsdKjacIHHA (big byteorder)

hash = mmh3.hash128(name.encode('utf-8'), signed=False).to_bytes(16, byteorder='little', signed=False)
print(base64.urlsafe_b64encode(hash).decode('utf-8').strip("="))
# Output: HAfCaaPSsXDCYM4s4jt7jg (little byteorder)

hash = mmh3.hash_bytes(name.encode('utf-8'))
print(base64.urlsafe_b64encode(hash).decode('utf-8').strip("="))
# Output: HAfCaaPSsXDCYM4s4jt7jg

In Go, murmur3 returns uint64 values, so we assumed signed=False in Python; we also tried signed=True, but that did not produce a matching hash either.
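The signed flag is unlikely to be the cause, because it only changes how the same 128 bits are read as an integer, not the bits themselves. A minimal sketch of that point, using only the standard library and the big-endian output shown above (the variable names are illustrative):

```python
import base64

# Big-endian bytes from the first Python output above (padding restored).
raw = base64.urlsafe_b64decode("jns74izOYMJwsdKjacIHHA==")

# Read the same 16 bytes as an unsigned and as a signed integer.
unsigned_value = int.from_bytes(raw, byteorder="big", signed=False)
signed_value = int.from_bytes(raw, byteorder="big", signed=True)

# The integers differ (the top bit is set, so the signed one is negative)...
print(unsigned_value != signed_value)

# ...but each one serializes back to the same 16 bytes.
same = (unsigned_value.to_bytes(16, "big", signed=False)
        == signed_value.to_bytes(16, "big", signed=True))
print(same)
```

Whichever interpretation is used, the bytes that reach the base64 encoder are identical, so the mismatch has to come from byte or word ordering.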

We are open to using different libraries, but would like to know whether there is a problem with our Go or Python approach to computing a base64-encoded hash from a string. Any help is appreciated.

Solution

The first Python result is almost correct.

>>> binascii.hexlify(base64.b64decode('jns74izOYMJwsdKjacIHHA=='))
b'8e7b3be22cce60c270b1d2a369c2071c'
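To compare all three outputs from the question at once, it can help to decode each base64 string to hex and split it into two 8-byte halves. A small sketch using only the standard library; `b64_to_hex` is a hypothetical helper name, and the strings are copied from the question:

```python
import base64

def b64_to_hex(text: str) -> str:
    """Decode unpadded URL-safe base64 and return lowercase hex (hypothetical helper)."""
    padded = text + "=" * (-len(text) % 4)  # restore the stripped '=' padding
    return base64.urlsafe_b64decode(padded).hex()

go_hex = b64_to_hex("cLHSo2nCBxyOezviLM5gwg")         # Go, hash.Sum(nil)
py_big_hex = b64_to_hex("jns74izOYMJwsdKjacIHHA")     # Python, byteorder='big'
py_little_hex = b64_to_hex("HAfCaaPSsXDCYM4s4jt7jg")  # Python, byteorder='little'

for label, value in [("go", go_hex), ("py big", py_big_hex), ("py little", py_little_hex)]:
    # Print each 16-byte digest as two 8-byte halves.
    print(f"{label:10s} {value[:16]} {value[16:]}")
```

Printed this way, the halves of each digest can be compared by eye.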

In Go:

    x, y := murmur3.Sum128([]byte("chocolate-covered-espresso-beans"))
    fmt.Printf("%x %x\n", x, y)

Result:

70b1d2a369c2071c 8e7b3be22cce60c2

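So the two 64-bit words are in the opposite order: Go puts 70b1d2a369c2071c first, whereas the big-endian Python bytes start with 8e7b3be22cce60c2. To get the same result in Python, swap the two 8-byte halves: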

name = "chocolate-covered-espresso-beans"
hash = mmh3.hash128(name.encode('utf-8'), signed=False).to_bytes(16, byteorder='big', signed=False)
hash = hash[8:] + hash[:8]
print(base64.urlsafe_b64encode(hash).decode('utf-8').strip("="))
# cLHSo2nCBxyOezviLM5gwg
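The same swap can be wrapped in a small function that works on the integer directly. This is a sketch rather than part of either library: `go_sum128_layout` and `encode` are hypothetical names, and the input is assumed to be the unsigned value returned by mmh3.hash128(..., signed=False). The literal below is taken from the hex dump of the first Python result, so the example runs without mmh3 installed:

```python
import base64

MASK64 = (1 << 64) - 1

def go_sum128_layout(value: int) -> bytes:
    """Return the low 64-bit word first, then the high word, each big-endian.

    Hypothetical helper: `value` is assumed to be an unsigned 128-bit integer
    such as the one mmh3.hash128(data, signed=False) returns.
    """
    low = value & MASK64           # second half of the big-endian Python bytes
    high = (value >> 64) & MASK64  # first half of the big-endian Python bytes
    return low.to_bytes(8, "big") + high.to_bytes(8, "big")

def encode(value: int) -> str:
    """Unpadded URL-safe base64, matching Go's base64.RawURLEncoding."""
    return base64.urlsafe_b64encode(go_sum128_layout(value)).decode("ascii").rstrip("=")

# Value from the hex dump of the first Python result above.
print(encode(0x8E7B3BE22CCE60C270B1D2A369C2071C))
# cLHSo2nCBxyOezviLM5gwg
```

Keeping the reordering in one place makes it easier to cover with a test against a known Go output.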


Statement:
This article is reproduced from stackoverflow.com. If there is any infringement, please contact admin@php.cn to have it removed.