Home > Article > Java > Java Improvement Chapter (33)-----Map Summary

Java Improvement Chapter (33)-----Map Summary

黄舟Original: 2017-02-11 10:24:081423browse

LZ introduced the implementation methods of HashMap, HashTable, and TreeMap in detail from the three aspects of data structure, implementation principle, and source code analysis. You should have a clearer understanding of these three classes. Below LZ makes a brief summary of Map.

3) —-HashMap

## Java Improvement Chapter (25)—–HashTable

Java Improvement Chapter (26)-----hashCode

# first look at the structure diagram of the map

Map:

Abstract interface for "key-value" pair mapping. The map does not include duplicate keys, one key corresponds to one value.

SortedMap:
Ordered key-value pair interface, inherits the Map interface.

NavigableMap: Inherits SortedMap and has a navigation method that returns the closest match for a given search target Interface.

AbstractMap: Implements most of the function interfaces in Map. It reduces the repeated coding of "Map implementation class".

Dictionary: The abstract parent class of any class that can map keys to corresponding values. Currently replaced by the Map interface.

TreeMap: Ordered hash table, implements the SortedMap interface, and the bottom layer is implemented through a red-black tree.

HashMap: is a hash table based on the "zipper method". The bottom layer is implemented using "array + linked list".

WeakHashMap: A hash table based on the "zipper method".

HashTable: Hash table based on the "zipper method".

Summarized as follows:

The difference between them:

2. Internal hash: Hash mapping technology

For us programmers we must understand it.

Hash mapping technology is a very simple technology that maps elements to arrays. Since the hash map uses an array result, there must be an indexing mechanism for determining access to the array by any key. This mechanism can provide an integer smaller than the size of the array. We call this mechanism a hash function. In Java we don't have to worry about finding such an integer, because every object must have a hashCode method that returns an integer value, and all we need to do is convert it to an integer and then divide the value by the array Just take the remainder from the size. As follows:

int hashValue = Maths.abs(obj.hashCode()) % size;

The following are HashMap and HashTable:

----------HashMap------------
//计算hash值
static int hash(int h) {
    h ^= (h >>> 20) ^ (h >>> 12);
    return h ^ (h >>> 7) ^ (h >>> 4);
}
//计算key的索引位置
static int indexFor(int h, int length) {
        return h & (length-1);
}
-----HashTable--------------
int index = (hash & 0x7FFFFFFF) % tab.length;     //确认该key的索引位置

The index of the position represents the position of the node in the array. The following figure is the basic principle diagram of hash mapping:

## In this figure, steps 1-4 are to find the element in the array Position, steps 5-8 are to insert the element into the array. There will be a little frustration during the insertion process. There may be multiple elements with the same hash value, so they will get the same index position, which means multiple elements will be mapped to the same position. This process is called "conflict". The way to resolve the conflict is to insert a linked list at the index position and simply add the element to this linked list. Of course, it is not a simple insertion. The processing process in HashMap is as follows: Get the linked list at the index position. If the linked list is null, insert the element directly. Otherwise, compare whether there is a key that is the same as the key. If it exists, overwrite it. The value of the original key and returns the old value, otherwise the element is saved at the head of the chain (the first element saved is placed at the end of the chain). The following is the put method of HashMap, which shows in detail the entire process of calculating the index position and inserting the element into the appropriate position:

public V put(K key, V value) {
        //当key为null，调用putForNullKey方法，保存null与table第一个位置中，这是HashMap允许为null的原因
        if (key == null)
            return putForNullKey(value);
        //计算key的hash值
        int hash = hash(key.hashCode());                 
        //计算key hash 值在 table 数组中的位置
        int i = indexFor(hash, table.length);            
        //从i出开始迭代 e,判断是否存在相同的key
        for (Entry<K, V> e = table[i]; e != null; e = e.next) {
            Object k;
            //判断该条链上是否有hash值相同的(key相同)
            //若存在相同，则直接覆盖value，返回旧value
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
                V oldValue = e.value;    //旧值 = 新值
                e.value = value;
                e.recordAccess(this);
                return oldValue;     //返回旧值
            }
        }
        //修改次数增加1
        modCount++;
        //将key、value添加至i位置处
        addEntry(hash, key, value, i);
        return null;
    }

The put method of HashMap shows the basic idea of hash mapping. In fact, if we look at other Maps, we find that the principles are similar!

# 3, MAP Optimization

##First of all, we assume this The size is only 1, and all elements will map to this position (0), thus forming a longer linked list. Since we need to perform a linear search on this linked list when updating and accessing, this will inevitably reduce efficiency. We assume that if there is a very large array and there is only one element at each position in the linked list, the object will be obtained by calculating its index value when accessing. Although this will improve the efficiency of our search, it will waste control. Admittedly, although these two methods are extreme, they provide us with an optimization idea: Use a larger array so that the elements can be evenly distributed. There are two factors in Map that will affect its efficiency. One is the initial size of the container and the other is the load factor. 3.1. Adjust the implementation size

In the hash map table, each position in the internal array is called As a "bucket", the number of available buckets (that is, the size of the internal array) is called capacity. In order to enable the Map object to effectively handle any number of elements, we designed the Map to be able to adjust itself the size of. We know that when the elements in the Map reach a certain amount, the container itself will be resized, but this resizing process is very expensive. Resizing requires inserting all original elements into the new array. We know index = hash(key) % length. This may cause the original conflicting keys to no longer conflict, and the non-conflicting keys to now conflict. The process of recalculation, adjustment, and insertion is very expensive and inefficient. So,

If we start to know the expected size value of the Map and adjust the Map to be large enough, we can greatly reduce or even eliminate the need to resize, which is likely to improve the speed. The following is the process of HashMap adjusting the container size. Through the following code we can see the complexity of its expansion process:

void resize(int newCapacity) {
            Entry[] oldTable = table;    //原始容器
            int oldCapacity = oldTable.length;    //原始容器大小
            if (oldCapacity == MAXIMUM_CAPACITY) {     //是否超过最大值：1073741824
                threshold = Integer.MAX_VALUE;
                return;
            }

            //新的数组：大小为 oldCapacity * 2
            Entry[] newTable = new Entry[newCapacity];    
            transfer(newTable, initHashSeedAsNeeded(newCapacity));
            table = newTable;
           /*
            * 重新计算阀值 =  newCapacity * loadFactor >  MAXIMUM_CAPACITY + 1 ? 
            *                         newCapacity * loadFactor :MAXIMUM_CAPACITY + 1
            */
            threshold = (int)Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);   
        }
        
        //将元素插入到新数组中
        void transfer(Entry[] newTable, boolean rehash) {
            int newCapacity = newTable.length;
            for (Entry<K,V> e : table) {
                while(null != e) {
                    Entry<K,V> next = e.next;
                    if (rehash) {
                        e.hash = null == e.key ? 0 : hash(e.key);
                    }
                    int i = indexFor(e.hash, newCapacity);
                    e.next = newTable[i];
                    newTable[i] = e;
                    e = next;
                }
            }
        }

3.2、负载因子

为了确认何时需要调整Map容器，Map使用了一个额外的参数并且粗略计算存储容器的密度。在Map调整大小之前，使用”负载因子”来指示Map将会承担的“负载量”，也就是它的负载程度，当容器中元素的数量达到了这个“负载量”，则Map将会进行扩容操作。负载因子、容量、Map大小之间的关系如下：负载因子 * 容量 > map大小 ----->调整Map大小。

例如：如果负载因子大小为0.75（HashMap的默认值），默认容量为11，则 11 * 0.75 = 8.25 = 8，所以当我们容器中插入第八个元素的时候，Map就会调整大小。

负载因子本身就是在控件和时间之间的折衷。当我使用较小的负载因子时，虽然降低了冲突的可能性，使得单个链表的长度减小了，加快了访问和更新的速度，但是它占用了更多的控件，使得数组中的大部分控件没有得到利用，元素分布比较稀疏，同时由于Map频繁的调整大小，可能会降低性能。但是如果负载因子过大，会使得元素分布比较紧凑，导致产生冲突的可能性加大，从而访问、更新速度较慢。所以我们一般推荐不更改负载因子的值，采用默认值0.75.

Java Improvement Chapter (33)-----Map Summary

Map:

2. Internal hash: Hash mapping technology

3.2、负载因子

最后

Related articles