Home >Backend Development >C++ >Using hash table to implement string search in C++

Using hash table to implement string search in C++

PHPz
PHPzOriginal
2023-08-22 12:03:181212browse

Using hash table to implement string search in C++

Hash table is a very common data structure that maps key values ​​into a fixed-size table, allowing efficient search, insertion, and deletion operations. In C, we can use unordered_map in STL (Standard Template Library) to implement a hash table.

In practical applications, it is often necessary to perform search operations on strings. For example, find the number of occurrences of a certain keyword in a text or find all lines that contain a certain string. To accomplish these tasks efficiently, string lookups can be implemented using hash tables.

In this article, we will introduce the specific method of using hash table to implement string search in C. We will use the example of finding the number of times a string appears in a text.

First, we need to define a function to map strings into a hash table. A common method is to use the hash value of the string as the key value, thus ensuring that different strings map to different locations. In order for a hash function to have good performance, it needs to be calculated quickly and the occurrence of hash conflicts should be minimized.

The following is a simple hash function implementation that adds the ASCII codes of the strings and takes the remainder:

size_t hash_func(const std::string& str) {
    size_t hash_val = 0;
    for (char c : str) {
        hash_val += static_cast<size_t>(c);
    }
    return hash_val % MAP_SIZE;
}

Next, we need to insert each word in the text into in the hash table. We can insert the text into a hash table by splitting it into words by spaces and calling a hash function. Since a keyword may appear multiple times, we need to record the number of times each keyword appears. We can use unordered_map to achieve this function. If the key value already exists during insertion, the value will be incremented:

std::unordered_map<std::string, size_t> word_map;
for (std::string word : words) {
    if (word_map.find(word) == word_map.end()) {
        word_map[word] = 1;
    } else {
        ++word_map[word];
    }
}

Finally, we can get it by calling the value corresponding to the string in the hash table. Number of occurrences in text:

size_t count = word_map["target_string"];

The complete code is as follows:

#include 
#include 
#include 
#include 

const size_t MAP_SIZE = 1024;

size_t hash_func(const std::string& str) {
    size_t hash_val = 0;
    for (char c : str) {
        hash_val += static_cast<size_t>(c);
    }
    return hash_val % MAP_SIZE;
}

int main() {
    std::vector words {"hello", "world", "hello", "c++", "hash", "world", "world"};
    std::unordered_map word_map;

    for (std::string word : words) {
        if (word_map.find(word) == word_map.end()) {
            word_map[word] = 1;
        } else {
            ++word_map[word];
        }
    }

    size_t count = word_map["world"];
    std::cout << "The word 'world' appears " << count << " times." << std::endl;

    return 0;
}

With the above code, we can use a hash table to quickly count the number of times a string appears in a text. The use of hash tables can improve search performance, which is more obvious for large amounts of data. It also has great flexibility and scalability in practical applications.

The above is the detailed content of Using hash table to implement string search in C++. For more information, please follow other related articles on the PHP Chinese website!

Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn