


Detailed explanation of how to count words in Python
This article walks through the idea behind counting word frequencies in Python. It also suggests trying a solution that does not use third-party modules. Interested readers can follow along.
Problem description:
Use Python to implement the function count_words(). The function takes a string s and a number n, and returns the n most frequent words in s. The return value is a list of tuples containing the n words with the most occurrences together with their counts, that is, [(
You can assume that all input is lowercase and contains no punctuation or other characters (only letters and single spaces). Words with the same number of occurrences are ordered alphabetically.
For example:
print(count_words("betty bought a bit of butter but the butter was bitter", 3))
Output:
[('butter', 2), ('a', 1), ('betty', 1)]
Ideas to solve the problem:
1. Split s on whitespace to get the list of all words, split_s, such as: ['betty', 'bought', 'a', 'bit', 'of', 'butter', 'but', 'the', 'butter', 'was', 'bitter']
2. Create map_list by converting split_s into a list of (word, 1) tuples, such as: [('betty', 1), ('bought', 1), ('a', 1), ('bit', 1), ('of', 1), ('butter', 1), ('but', 1), ('the', 1), ('butter', 1), ('was', 1), ('bitter', 1)]
3. Merge the elements of map_list. If two tuples have the same first element (the word), add their second elements (the counts) together.
Note: defaultdict is used for this step. The resulting data looks like this: {'betty': 1, 'bought': 1, 'a': 1, 'bit': 1, 'of': 1, 'butter': 2, 'but': 1, 'the': 1, 'was': 1, 'bitter': 1}
4. Sort alphabetically by key to get the following: [('a', 1), ('betty', 1), ('bit', 1), ('bitter', 1), ('bought', 1), ('but', 1), ('butter', 2), ('of', 1), ('the', 1), ('was', 1)]
5. Perform a second sort, by value in descending order, to get the following: [('butter', 2), ('a', 1), ('betty', 1), ('bit', 1), ('bitter', 1), ('bought', 1), ('but', 1), ('of', 1), ('the', 1), ('was', 1)]
6. Use slicing to extract the n entries with the highest frequency.
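Steps 4 and 5 work because Python's sorted() is stable: entries with equal counts keep the alphabetical order they received in the first pass. The sketch below is only an illustration, assuming the counts from step 3 are already in a plain dict; the names counts, two_pass and one_pass are made up for this example. It also shows that a single sort on a combined key gives the same ordering.

```python
# Counts as produced by step 3 (copied from the example above).
counts = {'betty': 1, 'bought': 1, 'a': 1, 'bit': 1, 'of': 1,
          'butter': 2, 'but': 1, 'the': 1, 'was': 1, 'bitter': 1}

# Steps 4 and 5: sort by word, then by count in descending order.
# sorted() is stable, so equal counts keep their alphabetical order.
by_word = sorted(counts.items(), key=lambda pair: pair[0])
two_pass = sorted(by_word, key=lambda pair: pair[1], reverse=True)

# Same ordering in one pass: negate the count so larger counts sort first,
# and use the word itself as the tie-breaker.
one_pass = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))

# Step 6: slice off the top three entries.
print(two_pass[:3])
print(one_pass[:3])
```

Either form satisfies the tie-breaking rule from the problem description; the one-pass version simply states both sort criteria in one place.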
Summary: On Python 3 the result may look correct for some inputs even without the alphabetical sort, but on Python 2 it does not. A defaultdict does not keep its keys in sorted order, so the items must be sorted explicitly before slicing.
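To illustrate why the explicit sort matters, here is a small sketch, assuming Python 3.7 or later (where dicts, including defaultdict, keep insertion order). The variable names are chosen for this example only.

```python
from collections import defaultdict

counts = defaultdict(int)
for word in "betty bought a bit of butter".split():
    counts[word] += 1

# Iterating the dict gives the order in which words were first seen,
# which is neither alphabetical nor ordered by frequency.
insertion_order = list(counts)
alphabetical = sorted(counts)

print(insertion_order)
print(alphabetical)
```

The two lists differ, so relying on the dict's own order would not produce the alphabetical tie-breaking the problem asks for.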
You can also try writing it yourself without importing any helper modules (collections is part of the standard library, but a plain dict is enough).
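As a minimal sketch of that do-it-yourself route, assuming a plain dict for the counts and the same two-pass sort described in the steps above (the name count_words_plain is made up for this illustration and is not from the original article):

```python
def count_words_plain(s, n):
    """Return the n most frequent words in s as (word, count) tuples."""
    counts = {}
    for word in s.split():
        # dict.get supplies 0 the first time a word is seen.
        counts[word] = counts.get(word, 0) + 1
    # First pass: alphabetical order by word.
    ranked = sorted(counts.items(), key=lambda pair: pair[0])
    # Second pass: highest count first; the stable sort keeps ties alphabetical.
    ranked = sorted(ranked, key=lambda pair: pair[1], reverse=True)
    return ranked[:n]


print(count_words_plain("cat bat mat cat bat cat", 3))
```

This needs no imports at all; the dict.get call plays the role that defaultdict(int) plays in the first solution.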
Solution 1 (use defaultdict):
"""Count words."""
from collections import defaultdict


def count_words(s, n):
    """Return the n most frequently occurring words in s."""
    # Step 1: split on whitespace.
    split_s = s.split()
    # Step 2: pair every word with a count of 1.
    map_list = [(k, 1) for k in split_s]
    # Step 3: merge the pairs, summing the counts per word.
    output = defaultdict(int)
    for d in map_list:
        output[d[0]] += d[1]
    output1 = dict(output)
    # Step 4: sort alphabetically by word.
    top_n = sorted(output1.items(), key=lambda pair: pair[0], reverse=False)
    # Step 5: sort by count, highest first (stable, so ties stay alphabetical).
    top_n = sorted(top_n, key=lambda pair: pair[1], reverse=True)
    # Step 6: keep the first n entries.
    return top_n[:n]


def test_run():
    """Test count_words() with some inputs."""
    print(count_words("cat bat mat cat bat cat", 3))
    print(count_words("betty bought a bit of butter but the butter was bitter", 4))


if __name__ == '__main__':
    test_run()
Solution 2 (use Counter):
"""Count words."""
from collections import Counter


def count_words(s, n):
    """Return the n most frequently occurring words in s."""
    split_s = s.split()
    # Counter tallies the occurrences of each word.
    split_s = Counter(name for name in split_s)
    print(split_s)  # show the raw counts
    # Sort alphabetically by word.
    top_n = sorted(split_s.items(), key=lambda pair: pair[0], reverse=False)
    print(top_n)  # show the alphabetical ordering
    # Sort by count, highest first (stable, so ties stay alphabetical).
    top_n = sorted(top_n, key=lambda pair: pair[1], reverse=True)
    print(top_n)  # show the final ranking
    return top_n[:n]


def test_run():
    """Test count_words() with some inputs."""
    print(count_words("cat bat mat cat bat cat", 3))
    print(count_words("betty bought a bit of butter but the butter was bitter", 4))


if __name__ == '__main__':
    test_run()
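Counter also provides a most_common() method, which might look like a shortcut here. According to the Python documentation, elements with equal counts are returned in the order first encountered (Python 3.7 and later), so on its own it does not apply the alphabetical tie-breaking rule from the problem description. A small sketch of the difference, with variable names chosen for this example only:

```python
from collections import Counter

s = "betty bought a bit of butter but the butter was bitter"
counts = Counter(s.split())

# most_common() ranks by count only; equal counts keep first-seen order.
by_count_only = counts.most_common(3)

# Sorting on (-count, word) applies the alphabetical tie-break as well.
with_tie_break = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))[:3]

print(by_count_only)
print(with_tie_break)
```

Both lists start with ('butter', 2), but only the explicitly sorted one is guaranteed to list the remaining words alphabetically.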