How to Find the Most Common Value in a Pandas DataFrame After Grouping?
The task is to clean a dataset with several string columns: group by the first two columns (Country and City) and, for each combination, select the most common value in the third column (Short name).
The original attempt fails with a KeyError, and grouping only by the City column raises an AssertionError, so a more robust approach is needed.
Since pandas 0.16, pd.Series.mode offers a simple and efficient way to do this:
source.groupby(['Country', 'City'])['Short name'].agg(pd.Series.mode)
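To make the call above concrete, here is a minimal sketch on a small, made-up DataFrame. The column names follow the article; the rows are hypothetical and chosen so that no group has a tie.

```python
import pandas as pd

# Hypothetical sample data: column names follow the article, rows are invented.
source = pd.DataFrame({
    "Country": ["USA", "USA", "USA", "Russia", "Russia"],
    "City": ["New-York", "New-York", "New-York",
             "Sankt-Petersburg", "Sankt-Petersburg"],
    "Short name": ["NY", "New", "NY", "Spb", "Spb"],
})

# One row per (Country, City) pair, holding the most frequent "Short name".
most_common = source.groupby(["Country", "City"])["Short name"].agg(pd.Series.mode)
print(most_common)

# "NY" appears in 2 of the 3 New-York rows, so it is that group's mode.
print(most_common.loc[("USA", "New-York")])
```

The result is a Series indexed by (Country, City); calling `.reset_index()` on it turns it back into a regular DataFrame if that is more convenient.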
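When a group has several modes, Series.mode returns all of them (as a Series, in sorted order), so the aggregated result holds several values for that group. To keep a single value per group, take the first mode with a lambda function: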
source.groupby(['Country', 'City'])['Short name'].agg(lambda x: pd.Series.mode(x)[0])
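The lambda above assumes every group has at least one non-missing value; if a group is entirely missing, the mode is empty and indexing with [0] fails. The sketch below shows one possible guard. The helper name `first_mode` and the sample rows are hypothetical, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical rows: Moscow has a two-way tie, Kazan has no usable value.
source = pd.DataFrame({
    "Country": ["Russia", "Russia", "Russia", "Russia", "Russia"],
    "City": ["Moscow", "Moscow", "Sankt-Petersburg", "Sankt-Petersburg", "Kazan"],
    "Short name": ["MSK", "MOW", "Spb", "Spb", None],
})

def first_mode(values: pd.Series):
    """Return one mode of `values`, or NaN when every entry is missing."""
    modes = values.mode()  # missing values are dropped by default
    return modes.iloc[0] if not modes.empty else np.nan

# Exactly one value per (Country, City) pair, even for tied or empty groups.
single = source.groupby(["Country", "City"])["Short name"].agg(first_mode)
print(single)
```

Because the tie is broken by position in the mode result rather than by any domain rule, it is worth checking whether "first mode" is really the value you want to keep for tied groups.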
scipy.stats.mode can also be used, but it does not report ties: when a group has multiple modes, it returns only one of them along with its count rather than raising an error.