How to Count Word Frequency and Sort by Frequency in Python?

Barbara Streisand
2024-10-21 21:39:03

Counting Word Frequency and Sorting by Frequency

When working with large text datasets, it's often necessary to count how many times each word occurs. These counts feed into many natural language processing (NLP) tasks. In Python, the Counter class from the standard library's collections module handles the counting, and the built-in sorted function handles the ordering.

Implementing the Design

Your design outlines the following steps:

  1. Create an empty list to store unique words (newlst).
  2. Create an empty list to store corresponding word frequencies (frequency).
  3. Iterate through the original list of words.
  4. For each word, check if it's already in newlst.
  5. If the word is not in newlst, add it and set the frequency to 1.
  6. If the word is already in newlst, increment its frequency.
  7. Sort newlst based on the frequency list.
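
For reference, the seven steps above can be written out by hand. This is a minimal sketch, not a recommended implementation: the function name count_and_sort and the sample list are assumptions made for illustration, while newlst and frequency follow the names used in the steps.

```python
def count_and_sort(words):
    """Manual version of the seven steps (hypothetical helper name)."""
    newlst = []      # step 1: unique words, in first-seen order
    frequency = []   # step 2: frequency[i] is the count for newlst[i]

    for word in words:                  # step 3
        if word not in newlst:          # steps 4 and 5
            newlst.append(word)
            frequency.append(1)
        else:                           # step 6
            frequency[newlst.index(word)] += 1

    # Step 7: pair each count with its word and sort by count, highest first.
    pairs = sorted(zip(frequency, newlst), key=lambda pair: pair[0], reverse=True)
    return [word for _, word in pairs]


# Sample input assumed for illustration.
words = ['the', 'car', 'apple', 'banana', 'car', 'apple']
result = count_and_sort(words)
print(result)
```

Both the membership test and newlst.index scan the list, so this approach slows down quadratically as the number of unique words grows. That cost is the main reason to prefer a dictionary-based structure such as Counter.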

Using Counter in Python

Python's collections module provides a dict subclass called Counter, which is designed for counting hashable elements in an iterable. Counter replaces steps 1-6 with a single line of code, leaving only the sort in step 7. Here's how you can implement your design using Counter:

<code class="python">from collections import Counter

# Create a Counter from the list of words
# (original_list stands for your own list of words)
counts = Counter(original_list)

# Sort the keys (unique words) based on their frequencies
sorted_words = sorted(counts.keys(), key=lambda x: counts[x], reverse=True)</code>

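This code generates a sorted list of unique words, where the word with the highest frequency appears first. Because sorted is stable, words with equal counts keep the order in which they were first seen (Counter preserves insertion order in Python 3.7 and later).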
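
Counter also has a built-in most_common method that returns (word, count) pairs already sorted from most to least frequent, which can stand in for the explicit sorted call. The sketch below assumes a small sample list; the variable names ranked and top_two are chosen for illustration.

```python
from collections import Counter

# Sample input assumed for illustration.
words = ['the', 'car', 'apple', 'banana', 'car', 'apple']
counts = Counter(words)

# All (word, count) pairs, most frequent first.
ranked = counts.most_common()

# Only the two most frequent pairs.
top_two = counts.most_common(2)

# Drop the counts to get just the words in frequency order.
sorted_words = [word for word, _ in ranked]
print(sorted_words)
```

Passing a number to most_common is convenient when only the top few words matter, since the full sorted list is never needed.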

Example

<code class="python">from collections import Counter

list1 = ['the', 'car', 'apple', 'banana', 'car', 'apple']
counts = Counter(list1)
# Output shown for Python 3.7+, where ties keep first-seen order
print(counts)  # Counter({'car': 2, 'apple': 2, 'the': 1, 'banana': 1})
sorted_words = sorted(counts.keys(), key=lambda x: counts[x], reverse=True)
print(sorted_words)  # ['car', 'apple', 'the', 'banana']</code>
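
If tied words should come out in alphabetical order regardless of where they first appear in the input, one option is to sort on a (negative count, word) tuple. This is a sketch of that variation using the same sample list, not part of the original design.

```python
from collections import Counter

list1 = ['the', 'car', 'apple', 'banana', 'car', 'apple']
counts = Counter(list1)

# Negating the count sorts higher counts first; the word itself breaks
# ties in ascending alphabetical order.
sorted_words = sorted(counts, key=lambda word: (-counts[word], word))
print(sorted_words)  # ['apple', 'car', 'banana', 'the']
```

The tuple key avoids reverse=True, which would also reverse the alphabetical ordering of the tied words.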
