Python counts word occurrences

angryTom
2020-02-13 11:09:12

Python counts the number of word occurrences

For word-frequency counting, a dictionary is the natural data type. Each word serves as a key, and the number of times it appears serves as the value, which makes it easy to record the frequency of every word. A dictionary works much like a phone book, in which each name is associated with a phone number.
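
The word-to-count mapping described above can be sketched in a few lines. This is only a minimal illustration: the helper name count_words and the sample list are assumptions made for this example, not part of the original code.

```python
def count_words(words):
    """Count how often each word occurs, using a plain dict (hypothetical helper)."""
    counts = {}
    for word in words:
        # dict.get returns 0 for a word not seen yet, so no explicit key check is needed
        counts[word] = counts.get(word, 0) + 1
    return counts


# Made-up sample input, for illustration only
sample = ["better", "than", "better", "is"]
print(count_words(sample))  # {'better': 2, 'than': 1, 'is': 1}
```

Each lookup and update on the dictionary takes constant time on average, so counting stays cheap even for long texts.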

The code below reads words from the file importthis.txt and reports the 5 most frequent words.

# -*- coding:utf-8 -*-
import io
import re

class Counter:
    def __init__(self, path):
        """
        :param path: path to the file
        """
        self.mapping = dict()
        with io.open(path, encoding="utf-8") as f:
            data = f.read()
            # Raw string avoids an invalid-escape warning for \w
            words = [s.lower() for s in re.findall(r"\w+", data)]
            for word in words:
                self.mapping[word] = self.mapping.get(word, 0) + 1

    def most_common(self, n):
        assert n > 0, "n should be larger than 0"
        return sorted(self.mapping.items(), key=lambda item: item[1], reverse=True)[:n]

if __name__ == '__main__':
    most_common_5 = Counter("importthis.txt").most_common(5)
    for item in most_common_5:
        print(item)

Output:

('is', 10)
('better', 8)
('than', 8)
('the', 6)
('to', 5)
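
For comparison, Python's standard library already provides collections.Counter, which offers the same counting and most_common behavior as the hand-written class above. The following is a sketch of that alternative, not the article's method: the helper name top_words and the in-memory sample text are assumptions chosen for illustration, and it does not read importthis.txt.

```python
import re
from collections import Counter


def top_words(text, n):
    """Return the n most frequent lowercase words in text as (word, count) pairs."""
    words = [w.lower() for w in re.findall(r"\w+", text)]
    # Counter builds the word -> count mapping; most_common sorts by count, descending
    return Counter(words).most_common(n)


# Made-up sample text, for illustration only
text = "Beautiful is better than ugly. Explicit is better than implicit."
for item in top_words(text, 3):
    print(item)
```

Note that the class defined above is also named Counter, so importing collections.Counter into the same module would shadow one of the two names. Renaming the custom class would avoid the clash.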
