Java data structure HashMap source code analysis

WBOY
2023-05-24 16:13:06

HashMap is one of the most commonly used data structures in the Java collections framework. It is a Map implementation backed by a hash table. In JDK 1.8, the implementation of its get and put methods differs from earlier versions. Let's walk through the source code step by step.

Basic structure

public class HashMap<K,V> extends AbstractMap<K,V>
    implements Map<K,V>, Cloneable, Serializable {
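    // ...
    /**
     * Default initial capacity: 16.
     */
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16
    /**
     * Default load factor: 0.75.
     */
    static final float DEFAULT_LOAD_FACTOR = 0.75f;
    /**
     * Maximum capacity: 1 << 30 (2 to the power of 30).
     */
    static final int MAXIMUM_CAPACITY = 1 << 30;
    /**
     * The array of buckets; its length is always a power of 2.
     */
    transient HashMap.Node<K,V>[] table;
    /**
     * The number of key-value pairs stored in the map.
     */
    transient int size;
    /**
     * The size at which the next resize is triggered.
     */
    int threshold;
    /**
     * The load factor, used to compute the threshold.
     */
    final float loadFactor;
    // ...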
}
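
The relationship between capacity, loadFactor and threshold can be illustrated with a small sketch. It assumes the threshold is simply the capacity multiplied by the load factor, which holds for the default values shown above; thresholdFor is a hypothetical helper written for this example, not a HashMap method.

```java
class ThresholdSketch {
    // Same default values as the fields listed above.
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4;
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    // Hypothetical helper: the size beyond which a table of this capacity is resized.
    static int thresholdFor(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        int capacity = DEFAULT_INITIAL_CAPACITY;
        // Each resize doubles the capacity, and the threshold grows with it.
        for (int i = 0; i < 4; i++) {
            System.out.println("capacity=" + capacity
                    + " threshold=" + thresholdFor(capacity, DEFAULT_LOAD_FACTOR));
            capacity <<= 1;
        }
    }
}
```

With the defaults, a table of 16 buckets is resized once it holds more than 12 entries.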

get method

    /**
     * Returns the value mapped to the key, or null if the key has no mapping.
     *
     * @param key the key to look up
     * @return the mapped value, or null if no mapping exists
     */
    public V get(Object key) {
        // The node that holds the key, if any
        HashMap.Node<K, V> e;
        // Look up the node with getNode and assign the result to e;
        // return null if e is null, otherwise return e's value
        return (e = getNode(hash(key), key)) == null ? null : e.value;
    }
    /**
     * Finds the node for a key, given the key and its hash.
     *
     * @param hash the hash of the key
     * @param key the key to look up
     * @return the matching node, or null if there is none
     */
    final HashMap.Node<K, V> getNode(int hash, Object key) {
        // Local variables: tab, first, e, n and k
        HashMap.Node<K, V>[] tab;
        HashMap.Node<K, V> first, e;
        int n;
        K k;
        // Search only if the table is non-null and non-empty
        // and the bucket for this hash has a head node
        if ((tab = table) != null && (n = tab.length) > 0 &&
                (first = tab[(n - 1) & hash]) != null) {
            // If the head node has the same hash and an equal key, return it directly
            if (first.hash == hash && // always check first node
                    ((k = first.key) == key || (key != null && key.equals(k))))
                return first;
            // The head node is not a match, so search the rest of the bucket
            if ((e = first.next) != null) {
                if (first instanceof HashMap.TreeNode)
                    // The bucket is a red-black tree: delegate to getTreeNode
                    return ((HashMap.TreeNode<K, V>) first).getTreeNode(hash, key);
                // The bucket is a linked list: walk it node by node
                do {
                    // Return the node whose hash matches and whose key is equal
                    if (e.hash == hash &&
                            ((k = e.key) == key || (key != null && key.equals(k))))
                        return e;
                } while ((e = e.next) != null);
            }
        }
        // No matching node was found in the table
        return null;
    }

The workflow of the get method is as follows:

  • Compute the bucket index in the hash table from the hash of the key, using (n - 1) & hash

  • Check the head node of that bucket, then traverse the linked list or tree at that position to find the matching key-value pair

  • If a matching key-value pair is found, return its value; otherwise return null
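
The first step can be sketched in isolation. The spread method below reproduces the hash(key) step that get calls, as it appears in the JDK 8 source (the high 16 bits of hashCode are XORed into the low 16 bits); the names spread and indexFor are chosen for this example only.

```java
class BucketIndexSketch {
    // Same computation as HashMap.hash in JDK 8: mix the high bits into the low bits,
    // because only the low bits survive the (n - 1) & hash mask for small tables.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Bucket index for a table whose length is a power of 2.
    static int indexFor(int hash, int tableLength) {
        return (tableLength - 1) & hash;
    }

    public static void main(String[] args) {
        int n = 16; // default table length
        for (String key : new String[] {"a", "b", "hello"}) {
            int hash = spread(key);
            System.out.println(key + " -> hash=" + hash + ", bucket=" + indexFor(hash, n));
        }
        // A null key always hashes to 0, so it lives in bucket 0.
        System.out.println("null -> bucket=" + indexFor(spread(null), n));
    }
}
```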

put method

    /**
     * Adds a key-value pair to the HashMap.
     *
     * @param key the key
     * @param value the value to associate with the key
     * @return the previous value for the key, or null if there was none
     */
    public V put(K key, V value) {
        // Compute the hash of the key and delegate the insertion to putVal
        return putVal(hash(key), key, value, false, true);
    }
    /**
     * Adds a key-value pair to the HashMap, given the key and its hash.
     *
     * @param hash the hash of the key
     * @param key the key
     * @param value the value to store
     * @param onlyIfAbsent if true, do not overwrite an existing non-null value
     * @param evict passed through to afterNodeInsertion
     * @return the previous value, or null if there was none
     */
    final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        // Local variables: tab, p, n and i
        HashMap.Node<K, V>[] tab;
        HashMap.Node<K, V> p;
        int n, i;
        // If the table is null or empty, call resize() to initialize it first
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
        // Compute the bucket index i and look at the node stored there
        if ((p = tab[i = (n - 1) & hash]) == null)
            // The bucket is empty: store the new node directly
            tab[i] = newNode(hash, key, value, null);
        else {
            HashMap.Node<K, V> e;
            K k;
            // The bucket is occupied: first check whether the head node holds the same key
            if (p.hash == hash &&
                    ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            else if (p instanceof HashMap.TreeNode)
                // The bucket is a red-black tree: insert through putTreeVal
                e = ((HashMap.TreeNode<K, V>) p).putTreeVal(this, tab, hash, key, value);
            else {
                // The bucket is a linked list: walk it looking for the key or the tail
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
                        // Reached the tail: append the new node
                        p.next = newNode(hash, key, value, null);
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            // The list already held TREEIFY_THRESHOLD (8) nodes before this append,
                            // so try to convert it to a red-black tree
                            treeifyBin(tab, hash);
                        break;
                    }
                    // Found a node with the same key: stop searching
                    if (e.hash == hash &&
                            ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            if (e != null) { // existing mapping for key
                // The key already exists: replace its value (unless onlyIfAbsent forbids it)
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        ++modCount;
        if (++size > threshold)
            // The size now exceeds the threshold: call resize() to grow the table
            resize();
        afterNodeInsertion(evict);
        return null;
    }

The workflow of the put method is as follows:

  • Compute the bucket index in the hash table from the hash of the key

  • If the bucket is empty, insert the new key-value pair directly

  • If the bucket is not empty, traverse the linked list or tree at that position to check whether the key already exists

  • If the key is found, replace its value and return the old value

  • If the key is not found, append the new key-value pair to the end of the linked list

  • If the length of the linked list reaches the treeify threshold (8 by default), convert the linked list into a red-black tree; treeifyBin resizes the table instead when it has fewer than 64 buckets

  • If the size of the HashMap after insertion exceeds the threshold (capacity × load factor, where the default load factor is 0.75), resize the HashMap

  • Around the insertion, perform the necessary bookkeeping, such as incrementing modCount and calling afterNodeInsertion
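
The two return paths of putVal (null for a new key, the old value for an existing key) are visible through the public API. The sketch below uses only java.util.HashMap; putIfAbsent is included because in JDK 8 it calls putVal with onlyIfAbsent set to true. The class name and the run helper are made up for this example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

class PutReturnSketch {
    // Performs three insertions for one key and records what each call returned,
    // followed by the value that is finally stored.
    static Integer[] run() {
        Map<String, Integer> map = new HashMap<>();
        Integer first = map.put("k", 1);         // no existing mapping: returns null
        Integer second = map.put("k", 2);        // existing mapping: value replaced, old value returned
        Integer third = map.putIfAbsent("k", 3); // existing non-null value is kept and returned
        return new Integer[] {first, second, third, map.get("k")};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run()));
    }
}
```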

In summary, both the get method and the put method of HashMap use the hash of the key to locate and insert key-value pairs. The put method has more cases to handle, including converting a linked list into a tree and resizing the table.
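
A small experiment shows why both methods compare keys with equals after matching the hash. The strings "Aa" and "BB" have the same hashCode, so they land in the same bucket, yet each is stored and retrieved independently. This is only an illustration through the public API; the class name and the build helper are made up for this example.

```java
import java.util.HashMap;
import java.util.Map;

class CollisionSketch {
    // Builds a map with two keys whose hashCode values are identical.
    static Map<String, Integer> build() {
        Map<String, Integer> map = new HashMap<>();
        map.put("Aa", 1);
        map.put("BB", 2); // same bucket as "Aa", but equals() tells the keys apart
        return map;
    }

    public static void main(String[] args) {
        System.out.println("same hashCode: " + ("Aa".hashCode() == "BB".hashCode()));
        Map<String, Integer> map = build();
        System.out.println("Aa=" + map.get("Aa") + ", BB=" + map.get("BB") + ", size=" + map.size());
    }
}
```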

Why the capacity of HashMap is always a power of 2

HashMap keeps its capacity at a power of 2 for performance: both index calculation and resizing reduce to simple bit operations.

Internally, HashMap stores its key-value pairs in an array. When a pair is added, HashMap derives its index in the array from the hashCode value of the key. If the length of the array were not a power of 2, computing the index would require a modulo operation, which is slower than a bit operation.

If the array length n is a power of 2, the index can be computed with a bitwise AND, (n - 1) & hash, which gives the same result as the modulo for such lengths and is faster. Resizing relies on the same property: the table doubles each time, so every node either keeps its index or moves by exactly the old capacity, which simplifies the calculation and improves performance during resizing.
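
Both claims can be checked with a short sketch. It compares the mask against Math.floorMod for a table of length 16, and checks that the index after doubling is either the old index or the old index plus the old capacity, depending on the single bit hash & oldCap. The method names are chosen for this example and do not appear in HashMap.

```java
class MaskVersusModSketch {
    // Index by bit mask; valid only when n is a power of 2.
    static int byMask(int hash, int n) {
        return (n - 1) & hash;
    }

    // Index by non-negative modulo; works for any positive n.
    static int byMod(int hash, int n) {
        return Math.floorMod(hash, n);
    }

    // Index of the same hash after the table doubles from oldCap to 2 * oldCap,
    // derived from the old index and one extra bit of the hash.
    static int indexAfterDoubling(int hash, int oldCap) {
        int oldIndex = (oldCap - 1) & hash;
        return (hash & oldCap) == 0 ? oldIndex : oldIndex + oldCap;
    }

    public static void main(String[] args) {
        int[] hashes = {0, 1, 17, 97, 2112, -1, Integer.MIN_VALUE, Integer.MAX_VALUE};
        for (int h : hashes) {
            System.out.println(h + ": mask=" + byMask(h, 16) + " mod=" + byMod(h, 16)
                    + " afterDoubling=" + indexAfterDoubling(h, 16)
                    + " direct=" + byMask(h, 32));
        }
    }
}
```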

A power-of-2 length also helps distribution. Because n - 1 then has all of its low bits set, every bucket index is reachable and each low bit of the hash affects the result, so entries spread more evenly across the array, which reduces hash collisions and improves the efficiency of HashMap.
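
To see what would go wrong otherwise, the sketch below applies the same mask to a hypothetical table of length 10. HashMap never creates such a table; the length is used here only to show that (n - 1) & hash can then reach just a few of the buckets.

```java
import java.util.Set;
import java.util.TreeSet;

class ReachableBucketsSketch {
    // Collects every index that (n - 1) & hash can produce for hashes 0..1023.
    static Set<Integer> reachable(int n) {
        Set<Integer> indexes = new TreeSet<>();
        for (int hash = 0; hash < 1024; hash++) {
            indexes.add((n - 1) & hash);
        }
        return indexes;
    }

    public static void main(String[] args) {
        // Length 10 is hypothetical: 10 - 1 = 9 = 0b1001, so only two bits of the hash matter.
        System.out.println("n=10 reaches " + reachable(10));
        // Length 16: 16 - 1 = 15 = 0b1111, so all four low bits matter.
        System.out.println("n=16 reaches " + reachable(16));
    }
}
```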

Statement:
This article is reproduced from yisu.com. If there is any infringement, please contact admin@php.cn to have it removed.