
hashmap - In C++, how can Chinese strings be hashed so that collisions are as few as possible?

The strings are all at most ten Chinese characters long. How should the hash function be designed?

Asked by 阿神, 2886 days ago

Replies (2)

  • 高洛峰, 2017-04-17 11:26:36

    I'd recommend two articles on string hash functions:
    https://www.byvoid.com/blog/string-hash-compare
    http://blog.csdn.net/icefireelf/article/details/5796529
    You could probably treat each Chinese character (a multi-byte or wide character) as a sequence of single bytes, much like several ASCII characters, and apply these algorithms to the resulting byte string.
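The byte-wise idea above can be sketched as follows. This is only an illustration, not something taken from the linked articles: it assumes the strings are stored as `std::string` in one consistent encoding (GB2312 bytes are used here), and it picks FNV-1a as the byte-oriented hash. The names `fnv1a64`, `ByteStringHash`, `WordTable`, and `make_demo_table` are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

// FNV-1a (64-bit) over the raw bytes of the string. The encoding (GB2312,
// UTF-8, ...) does not matter as long as every key uses the same one.
inline std::uint64_t fnv1a64(const std::string& s) noexcept {
    std::uint64_t h = 14695981039346656037ULL;  // FNV offset basis
    for (unsigned char byte : s) {
        h ^= byte;                  // mix in one byte
        h *= 1099511628211ULL;      // multiply by the FNV prime
    }
    return h;
}

// Hash functor that plugs the byte hash into std::unordered_map.
struct ByteStringHash {
    std::size_t operator()(const std::string& s) const noexcept {
        return static_cast<std::size_t>(fnv1a64(s));
    }
};

// Hypothetical table type: maps a short Chinese string to an int.
using WordTable = std::unordered_map<std::string, int, ByteStringHash>;

// Small demo table. The keys are written as explicit GB2312 byte escapes
// so the example does not depend on the source file's encoding.
inline WordTable make_demo_table() {
    WordTable t;
    t["\xD6\xD0\xCE\xC4"] = 1;  // "中文" in GB2312 (4 bytes)
    t["\xD6\xD0"] = 2;          // "中" in GB2312 (2 bytes)
    return t;
}
```

Note that `std::unordered_map` compares the full keys on lookup, so a hash collision only costs some speed and never returns a wrong entry.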

  • 大家讲道理, 2017-04-17 11:26:36

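    In GB2312 each Chinese character takes 2 bytes, so ten characters occupy at most 20 bytes. If you use those bytes directly as the "hash value", distinct strings always map to distinct values, so there are no collisions at all.
    By the way, a SHA-1 digest is also 160 bits, i.e. 20 bytes, so it is no shorter than the key itself. A SHA-512 digest is even longer (64 bytes), so it's better not to use it.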
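One way to read "use the 20 bytes directly" is to make the bytes themselves the key of an ordered container, so no hash function is involved at all. The sketch below makes several assumptions: keys are GB2312 byte strings of at most 20 bytes with no embedded NUL byte, shorter keys are zero-padded, and the names `Key20`, `make_key`, and `ExactTable` are hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>

constexpr std::size_t kMaxKeyBytes = 20;  // ten GB2312 characters, 2 bytes each
using Key20 = std::array<unsigned char, kMaxKeyBytes>;

// Copy the raw bytes into a fixed 20-byte key, zero-padded on the right.
// Assumption: valid keys contain no NUL byte, so the padding keeps
// distinct strings distinct.
inline Key20 make_key(const std::string& gb2312_bytes) {
    if (gb2312_bytes.size() > kMaxKeyBytes) {
        throw std::length_error("key longer than 20 bytes");
    }
    Key20 key{};  // value-initialized: all bytes are zero
    for (std::size_t i = 0; i < gb2312_bytes.size(); ++i) {
        key[i] = static_cast<unsigned char>(gb2312_bytes[i]);
    }
    return key;
}

// std::array compares lexicographically, so it works as a std::map key
// without any custom comparator or hash.
using ExactTable = std::map<Key20, int>;
```

Lookups here cost O(log n) comparisons of 20 bytes each instead of one hash computation, which is usually acceptable for keys this short. A plain `std::map<std::string, int>` keyed on the same bytes behaves equivalently without the fixed-size padding.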
