Hash (Hash), also known as hashing, is a very common algorithm. Hash is mainly used in JavaHashMap data structure. The hash algorithm consists of two parts: hash function and hash table. We can know the characteristics of our array. We can quickly locate elements through subscripts (O(1)). Similarly, in the hash table, we can use the key (hash value) To quickly locate a certain value, the hash value is calculated through the hash function (hash(key) = address). The element [address] = value can be located through the hash value. The principle is similar to that of an array.
The best hash function is of course one that can calculate a unique hash value for each key value, but there may often be different keyvalue hash value, which causes conflicts. There are two aspects to judge whether a hash function is well designed:
1. Few conflicts .
2. Calculation is fast.
The following are several commonly used hash functions. They all have certain mathematical principles behind them and have undergone a lot of practice. Their mathematical principles will not be explored here.
BKDRHash function (h = 31 * h + c)
This hash function is applied to string hash value calculation in Java.
//String#hashCodepublic int hashCode() {int h = hash;if (h == 0 && value.length > 0) {char val[] = value;for (int i = 0; i
##DJB2 Hash function (h = h )
ElasticSearchuses DJB2The hash function hashes the specified key of the document to be indexed.
##SDBMHash function (h = h ) is applied in
SDBM(a simple database engine). The above is just a list of three hash functions. Let’s do an experiment to see how they conflict.
Java
1 package com.algorithm.hash; 2 3 import java.util.HashMap; 4 import java.util.UUID; 5 6 /** 7 * 三种哈希函数冲突数比较 8 * Created by yulinfeng on 6/27/17. 9 */10 public class HashFunc {11 12 public static void main(String[] args) {13 int length = 1000000; //100万字符串14 //利用HashMap来计算冲突数,HashMap的键值不能重复所以length - map.size()即为冲突数15 HashMap<string> bkdrMap = new HashMap<string>();16 HashMap<string> djb2Map = new HashMap<string>();17 HashMap<string> sdbmMap = new HashMap<string>();18 getStr(length, bkdrMap, djb2Map, sdbmMap);19 System.out.println("BKDR哈希函数100万字符串的冲突数:" + (length - bkdrMap.size()));20 System.out.println("DJB2哈希函数100万字符串的冲突数:" + (length - djb2Map.size()));21 System.out.println("SDBM哈希函数100万字符串的冲突数:" + (length - sdbmMap.size()));22 }23 24 /**25 * 生成字符串,并计算冲突数26 * @param length27 * @param bkdrMap28 * @param djb2Map29 * @param sdbmMap30 */31 private static void getStr(int length, HashMap<string> bkdrMap, HashMap<string> djb2Map, HashMap<string> sdbmMap) {32 for (int i = 0; i <div class="cnblogs_code"> The following are </div>10<p><span style="font-family: Calibri;">万、</span><span style="font-family: SimSun;">100</span><span style="font-family: Calibri;"> Ten thousand, </span><span style="font-family: SimSun;">200</span><span style="font-family: Calibri;"> Conflict number situation: </span><span style="font-family: SimSun;"></span></p> <p><span style="font-family: SimSun;"><img src="/static/imghwm/default1.png" data-src="https://img.php.cn/upload/article/000/000/001/6d0143f1fa951707d245c36e169c2fd5-0.png?x-oss-process=image/resize,p_40" class="lazy" alt=""></span></p> <p><img src="/static/imghwm/default1.png" data-src="https://img.php.cn/upload/article/000/000/001/6d0143f1fa951707d245c36e169c2fd5-1.png?x-oss-process=image/resize,p_40" class="lazy" alt=""></p> <p><img src="/static/imghwm/default1.png" data-src="https://img.php.cn/upload/article/000/000/001/6d0143f1fa951707d245c36e169c2fd5-2.png?x-oss-process=image/resize,p_40" class="lazy" alt=""> After repeated trials, the number of collisions of the three hash functions is actually about the same. </p> <p> </p> <p>Python<span style="font-size: 18px;"><strong>3</strong><strong></strong></span></p> <pre class="brush:php;toolbar:false"> 1 import uuid 2 3 def hash_test(length, bkdrDic, djb2Dic, sdbmDic): 4 for i in range(length): 5 string = str(uuid.uuid1()) #基于时间戳 6 bkdrDic[bkdr(string)] = string 7 djb2Dic[djb2(string)] = string 8 sdbmDic[sdbm(string)] = string 9 10 #BDKR哈希函数11 def bkdr(string):12 hash = 013 for i in range(len(string)):14 hash = 31 * hash + ord(string[i]) # h = 31 * h + c15 return hash16 17 #DJB2哈希函数18 def djb2(string):19 hash = 020 for i in range(len(string)):21 hash = 33 * hash + ord(string[i]) # h = h <p> A hash table is a data structure that needs to be used with a hash function to create an index for quick search <span style="font-family: Calibri;">——</span><span style="font-family: SimSun;">"Algorithm Notes". Generally speaking, it is a fixed-length storage space. For example, </span><span style="font-family: Calibri;">HashMap</span><span style="font-family: SimSun;">The default hash table is a fixed-length hash table of </span><span style="font-family: Calibri;">16</span><span style="font-family: SimSun;"></span><span style="font-family: Calibri;">Entry</span><span style="font-family: SimSun;">Array. After having a fixed-length storage space, the remaining question is how to put the value into which position. Usually if the hash value is </span><span style="font-family: Calibri;">m</span><span style="font-family: SimSun;">, the length is </span><span style="font-family: Calibri;"> n</span><span style="font-family: SimSun;">, then this value is placed at the </span><span style="font-family: Calibri;">m mod n</span><span style="font-family: SimSun;"> position. </span></p><p> <img src="/static/imghwm/default1.png" data-src="https://img.php.cn/upload/article/000/000/001/6d0143f1fa951707d245c36e169c2fd5-3.png?x-oss-process=image/resize,p_40" class="lazy" alt=""></p><p> The picture above shows the hash and hash table, as well as the solution to the conflict (zipper method). There are many solutions to conflicts, including hashing again until there is no conflict, and also using the zipper method to use linked lists to concatenate elements at the same position as shown in the figure above. </p><p> Imagine that the length of the hash table in the above example is <span style="max-width:90%">10</span><span style="font-family: SimSun;">, resulting in </span><span style="font-family: Calibri;">1</span><span style="font-family: SimSun;"> conflicts. If the hash The table length is </span><span style="font-family: Calibri;">20</span><span style="font-family: SimSun;">, then there will be no conflict. The search is faster but will waste more space. If the hash table length is </span><span style="font-family: Calibri;">2</span><span style="font-family: SimSun;">, the conflict search will be inverted </span><span style="font-family: Calibri;">3</span><span style="font-family: SimSun;"> times slower, but this will save a lot of space. <strong>So the length selection of the hash table is crucial, but it is also an important problem. </strong></span></p><p><em> Supplement: </em></p><p><em> Hashes are used in many aspects. For example, different values have different hash values, but The hash algorithm can also be designed so that similar or identical values have similar or identical hash values. That is to say, if two objects are completely different, then their hash values are completely different; if two objects are exactly the same, then their hash values are also exactly the same; the more similar the two objects are, then their hash values are also completely different. The more similar they are. This is actually a similarity problem, which means that this idea can be generalized and applied to similarity calculations (such as the Jaccard distance problem), and ultimately applied to accurate advertising placement, product recommendation, etc. </em></p><p><em> In addition, consistent hashing can also be applied in load balancing. How to ensure that each server can evenly distribute the load pressure, a good hash algorithm can also do it. </em></p>
The above is the detailed content of Hash--Introduction to common algorithms. For more information, please follow other related articles on the PHP Chinese website!

如何使用C++中的哈希搜索算法哈希(Hash)搜索算法是一种高效的查找和存储技术,它将关键字通过哈希函数转化为一个固定长度的索引,然后利用这个索引在数据结构中进行搜索。在C++中,我们可以通过使用标准库中的哈希容器和哈希函数来实现哈希搜索算法。本文将介绍如何使用C++中的哈希搜索算法,并提供具体的代码示例。引入头文件和命名空间首先,在使用C++中的哈希搜索算

如何用Python编写哈希查找算法?哈希查找算法,又称为散列查找算法,是一种基于哈希表的数据查找方法。相比于线性查找和二分查找等传统查找算法,哈希查找算法具有更高的查找效率。在Python中,我们可以使用字典(dictionary)来实现哈希表,进而实现哈希查找。哈希查找算法的基本思想是将待查找的关键字通过哈希函数转换成一个索引值,然后根据索引值在哈希表中查

Python底层技术揭秘:如何实现哈希算法,需要具体代码示例摘要:哈希算法是计算机领域中常用的技术之一,用于快速确定数据的唯一标识。Python作为一门高级语言,提供了许多内建的哈希函数,如hash()函数以及各种散列算法的实现。本文将揭示哈希算法的原理和Python底层实现的细节,并提供具体的代码示例。哈希算法简介哈希算法,又称散列算法,是一种将任意长度的

在今天的数字时代中,随着互联网的发展和信息的日益重要,数据的保密性和安全性变得越来越重要。为了确保数据在传输过程中不被窃取或篡改,PHP开发人员通常使用加密和哈希技术来保护敏感数据。本文将介绍PHP开发中最常用的加密和哈希技术,以及它们的优缺点。一、加密技术加密是一种保护数据安全性的技术,它使用算法将数据转换为无意义的形式。只有持有密钥的人才能将其还原为可读

在了解比特币投资和区块链技术中,哈希算法可以说经常出现,币圈戏言说唱有嘻哈,算法有哈希。关于“算法”一词,目前国内用户使用的比较模糊,有时指共识机制,有时指具体的Hash算法,作为区块链算法,哈希算法一直让普通大众感到晦涩难懂,那么,什么是哈希算法?接下来币圈子小编就来给大家通俗的讲解一下哈希算法是什么?希望能够让投资者看完本文就能读懂哈希算法。什么是哈希算法?哈希音译自“Hash”,又名为“散列”。本质上是一种计算机程序,可接收任意

哈希函数是可用于将任意大小的数据映射到固定大小的数据的任何函数。哈希函数返回的值称为哈希值、哈希代码、摘要或简称为哈希。语法stringhash(string$algo,string$data[,bool$raw_output=FALSE])参数algo所选哈希算法的名称(如“md5”、“sha256”、“haval160,4”等)data要散列的消息。raw_output设置为TRUE时,输出原始二进制数据。FALSE输出小写十六进制。示例<?php  

如何处理记账系统中的哈希和加密功能-使用PHP实现哈希和加密的开发方法引言:随着数字化时代的到来,各种信息系统的安全性变得越发重要。在设计和开发记账系统时,保护用户隐私数据是至关重要的。其中,使用哈希和加密功能可以有效地保护用户的敏感信息。本文将介绍如何使用PHP实现记账系统中的哈希和加密功能,并提供具体的代码示例。一、哈希功能的实现哈希是一种单向加密算

介绍在当今快节奏且互联的数字化世界中,确保应用程序的高可用性至关重要。负载均衡技术使应用程序能够在多个服务器上分发传入流量,从而提高性能和可靠性。PHP提供了一系列负载均衡技术的支持,每种技术都具有其独特的优势和限制。轮询(RoundRobin)轮询是一种简单而有效的负载均衡技术,它将请求按顺序分发到服务器池。这种方法易于实现,并且可以保证请求在服务器之间均匀分布。$servers=array("server1","server2","server3");$index=0;while(true)


Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

AI Hentai Generator
Generate AI Hentai for free.

Hot Article

Hot Tools

VSCode Windows 64-bit Download
A free and powerful IDE editor launched by Microsoft

SublimeText3 Mac version
God-level code editing software (SublimeText3)

Zend Studio 13.0.1
Powerful PHP integrated development environment

mPDF
mPDF is a PHP library that can generate PDF files from UTF-8 encoded HTML. The original author, Ian Back, wrote mPDF to output PDF files "on the fly" from his website and handle different languages. It is slower than original scripts like HTML2FPDF and produces larger files when using Unicode fonts, but supports CSS styles etc. and has a lot of enhancements. Supports almost all languages, including RTL (Arabic and Hebrew) and CJK (Chinese, Japanese and Korean). Supports nested block-level elements (such as P, DIV),

SAP NetWeaver Server Adapter for Eclipse
Integrate Eclipse with SAP NetWeaver application server.
