首頁 >web前端 >js教程 >哈希表:碰撞、調整大小、哈希

哈希表:碰撞、調整大小、哈希

WBOY
WBOY原創
2024-08-17 08:50:02844瀏覽

Hash Tables : Collisions, Resizing, Hashing

假設理解 Big O 表示法。 JavaScript 中有範例。資料參考 Gayle Laakmann McDowell 的《Cracking the Coding Interview》

了解哈希表

無論您聽說過字典、哈希映射還是哈希表,它們本質上都是相同的。在本部落格中,為了簡單起見,我們將將此資料結構引用為雜湊表。

讓我們先從定義什麼是雜湊表開始。 雜湊表是一種資料結構,它以鍵值對的形式將鍵映射到值,以實現高效查找。有多種方法可以實現它。


哈希表的關鍵組成部分

使用鍊錶數組雜湊函數我們可以實作一個雜湊表。讓我們更深入地了解什麼是哈希函數。

什麼是哈希函數?

雜湊函數是雜湊表的重要組成部分。它是一種演算法,通常採用函數的形式,接受輸入(或“鍵”)並傳回固定大小的位元組字串(通常採用整數的形式)。輸出稱為雜湊碼或簡稱雜湊。

散列表中散列函數的主要目的是將散列碼對應到桶/槽數組的有效索引,從中可以找到所需的值。在我們的例子中,這些桶子/槽將是連結列表。

良好雜湊函數的特性:

  • 確定性:對於給定的輸入,它應該始終產生相同的雜湊輸出。
  • 均勻分佈:它應該在其輸出範圍內盡可能均勻地映射預期輸入。
  • 高效率:計算速度應該很快。
  • 雪崩效應:輸入的微小變化應該會導致雜湊輸出顯著不同。

為什麼使用鍊錶數組?

在雜湊表實作中使用鍊錶數組是一種常見技術,稱為連結。這種方法有幾個優點:

  1. 碰撞處理:使用鍊錶陣列的主要原因是為了有效地處理碰撞。當兩個鍵散列並對應到同一個索引時,我們可以簡單地將新的鍵值對新增到該索引處的鍊錶中。
  2. 空間效率:它允許雜湊表儲存比底層數組大小更多的項目。每個數組槽可以透過其鍊錶保存多個項目。
Array:  [0] -> (key1, value1) -> (key2, value2)
        [1] -> (key3, value3)
        [2] -> (key4, value4) -> (key5, value5) -> (key6, value6)
        [3] -> Empty
        [4] -> (key7, value7)

在此範例中,鍵 1 和 2 哈希到索引 0,而鍵 4、5 和 6 都哈希到索引 2。


插入鍵值對

現在我們已經很好地理解了雜湊函數以及為什麼使用鍊式技術,讓我們來看看將鍵值對插入雜湊表的流程:

  1. 插入鍵(任何值)時,我們先計算鍵的雜湊碼(通常是int或long)。 兩個不同的鍵可能有相同的雜湊碼,因為可能有無限的鍵和有限的整數。

  2. 將雜湊碼對應到陣列中的索引。將雜湊碼對應到陣列的常見方法是使用模運算子。 (例如,hash(key) % array.length))。 使用此方法,兩個不同的雜湊碼可能會對應到相同索引。

  3. 在索引處,有一個鍵和值的連結清單。將鍵值對儲存在此索引處。當鍵具有相同的雜湊碼或雜湊碼映射到相同的索引時,就會發生衝突。

訪問鍵值對

在雜湊表實作中存取鍵值對非常有效率。只需根據鍵計算雜湊碼,然後根據雜湊碼計算索引,最後在鍊錶中搜尋具有該鍵的值。

假設實作良好,存取鍵值對(插入和刪除也是如此),需要 O(1)1)O(1) O(1)

.

What Makes a Hash Table Implementation "Good"?

A well-implemented hash table should balance efficiency, space utilization, and collision handling. Here are the key factors that contribute to a good hash table implementation:

A Good Hash Function

The heart of any hash table is its hash function. A good hash function should:

  • Be quick to compute
  • Minimize collisions

Optimal Load Factor

The load factor is the ratio of filled slots to total slots in the hash table. Maintaining an appropriate load factor is crucial:

A typical sweet spot is between 0.6 and 0.75

  • Too low (< 0.5): Wastes space
  • Too high (> 0.8): Increases collision risk

Collision Resolution Techniques

Two primary methods for handling collisions are:

  1. Chaining: Each table position stores a linked list of collided items. Simple to implement but can lead to slower lookups if chains become long.

  2. Open Addressing: If a collision occurs, look for the next available slot. Keeps all data in the table but requires careful implementation to avoid clustering of stored data.

Note that chaining and open-addressing cannot coexist easily. Logically, it would not make sense to look for the next available slot but store collided items at a specific index.

Dynamic Resizing

As the number of elements grows, the hash table should resize to maintain performance:

Typically, the table size is doubled when the load factor exceeds a threshold. All elements need to be rehashed into the new, larger table.

This operation is expensive but infrequent, keeping the amortized time complexity at O(1).


JavaScript Implementation

This implementation will utilize resizing and chaining for collision resolution. We will assume that our keys are integers.

For the hash function + mapping, we will keep it very simple and simply perform the following given a key:

key%arraycapacitykey \hspace{0.2cm} \% \hspace{0.2cm} array \hspace{0.1cm} capacitykey%arraycapacity

Classical OOP

class HashNode {
    constructor(key, value) {
        this.key = key;
        this.value = value;
        this.next = null;
    }
}

class HashTable {
    constructor(capacity = 16) {
        this.capacity = capacity;
        this.size = 0;
        this.buckets = new Array(this.capacity).fill(null);
        this.threshold = 0.75;
    }

    hash(key) {
        return key % this.capacity;
    }

    insert(key, value) {
        const index = this.hash(key);
        if (!this.buckets[index]) {
            this.buckets[index] = new HashNode(key, value);
            this.size++;
        } else {
            let currentNode = this.buckets[index];
            while (currentNode.next) {
                if (currentNode.key === key) {
                    currentNode.value = value;
                    return;
                }
                currentNode = currentNode.next;
            }
            if (currentNode.key === key) {
                currentNode.value = value;
            } else {
                currentNode.next = new HashNode(key, value);
                this.size++;
            }
        }

        if (this.size / this.capacity >= this.threshold) {
            this.resize();
        }
    }

    get(key) {
        const index = this.hash(key);
        let currentNode = this.buckets[index];
        while (currentNode) {
            if (currentNode.key === key) {
                return currentNode.value;
            }
            currentNode = currentNode.next;
        }
        return undefined;
    }

    remove(key) {
        const index = this.hash(key);
        if (!this.buckets[index]) {
            return false;
        }
        if (this.buckets[index].key === key) {
            this.buckets[index] = this.buckets[index].next;
            this.size--;
            return true;
        }
        let currentNode = this.buckets[index];
        while (currentNode.next) {
            if (currentNode.next.key === key) {
                currentNode.next = currentNode.next.next;
                this.size--;
                return true;
            }
            currentNode = currentNode.next;
        }
        return false;
    }

    resize() {
        const newCapacity = this.capacity * 2;
        const newBuckets = new Array(newCapacity).fill(null);
        this.buckets.forEach(head => {
            while (head) {
                const newIndex = head.key % newCapacity;
                const next = head.next;
                head.next = newBuckets[newIndex];
                newBuckets[newIndex] = head;
                head = next;
            }
        });
        this.buckets = newBuckets;
        this.capacity = newCapacity;
    }

    getSize() {
        return this.size;
    }

    getCapacity() {
        return this.capacity;
    }
}

Functional OOP

function createHashTable(initialCapacity = 16) {
    let capacity = initialCapacity;
    let size = 0;
    let buckets = new Array(capacity).fill(null);
    const threshold = 0.75;

    function hash(key) {
        return key % capacity;
    }

    function resize() {
        const newCapacity = capacity * 2;
        const newBuckets = new Array(newCapacity).fill(null);
        buckets.forEach(function(head) {
            while (head) {
                const newIndex = head.key % newCapacity;
                const next = head.next;
                head.next = newBuckets[newIndex];
                newBuckets[newIndex] = head;
                head = next;
            }
        });
        buckets = newBuckets;
        capacity = newCapacity;
    }

    return {
        insert: function(key, value) {
            const index = hash(key);
            const newNode = { key, value, next: null };

            if (!buckets[index]) {
                buckets[index] = newNode;
                size++;
            } else {
                let currentNode = buckets[index];
                while (currentNode.next) {
                    if (currentNode.key === key) {
                        currentNode.value = value;
                        return;
                    }
                    currentNode = currentNode.next;
                }
                if (currentNode.key === key) {
                    currentNode.value = value;
                } else {
                    currentNode.next = newNode;
                    size++;
                }
            }

            if (size / capacity >= threshold) {
                resize();
            }
        },

        get: function(key) {
            const index = hash(key);
            let currentNode = buckets[index];
            while (currentNode) {
                if (currentNode.key === key) {
                    return currentNode.value;
                }
                currentNode = currentNode.next;
            }
            return undefined;
        },

        remove: function(key) {
            const index = hash(key);
            if (!buckets[index]) {
                return false;
            }
            if (buckets[index].key === key) {
                buckets[index] = buckets[index].next;
                size--;
                return true;
            }
            let currentNode = buckets[index];
            while (currentNode.next) {
                if (currentNode.next.key === key) {
                    currentNode.next = currentNode.next.next;
                    size--;
                    return true;
                }
                currentNode = currentNode.next;
            }
            return false;
        },

        getSize: function() {
            return size;
        },

        getCapacity: function() {
            return capacity;
        }
    };
}

以上是哈希表:碰撞、調整大小、哈希的詳細內容。更多資訊請關注PHP中文網其他相關文章!

陳述:
本文內容由網友自願投稿,版權歸原作者所有。本站不承擔相應的法律責任。如發現涉嫌抄襲或侵權的內容,請聯絡admin@php.cn