Pandas DataFrame GroupBy Multiple Columns for Value Counts
Grouping a Pandas DataFrame by more than one column makes it possible to count how often each combination of values occurs. This article demonstrates how to count observations while grouping by two columns, and how to determine the highest count within each group.
Given a DataFrame with multiple columns, it is possible to apply the 'groupby' function to group data based on specific columns. Here, we have a DataFrame named 'df' with five columns: 'col1', 'col2', 'col3', 'col4', and 'col5'.
<code class="python">import pandas as pd

# Each inner list holds the values of one column; .T transposes the
# five rows into five columns (all columns end up with dtype object).
df = pd.DataFrame([
    [1.1, 1.1, 1.1, 2.6, 2.5, 3.4, 2.6, 2.6, 3.4, 3.4, 2.6, 1.1, 1.1, 3.3],
    list('AAABBBBABCBDDD'),
    [1.1, 1.7, 2.5, 2.6, 3.3, 3.8, 4.0, 4.2, 4.3, 4.5, 4.6, 4.7, 4.7, 4.8],
    ['x/y/z', 'x/y', 'x/y/z/n', 'x/u', 'x', 'x/u/v', 'x/y/z', 'x', 'x/u/v/b', '-', 'x/y', 'x/y/z', 'x', 'x/u/v/w'],
    ['1', '3', '3', '2', '4', '2', '5', '3', '6', '3', '5', '1', '1', '1']
]).T
df.columns = ['col1', 'col2', 'col3', 'col4', 'col5']</code>
Counting by Row Groups
To count the number of observations in each row group, use the 'groupby' function on the desired columns and then apply the 'size' function.
<code class="python">result = df.groupby(['col5', 'col2']).size()</code>
This produces a Series (not a DataFrame) whose MultiIndex is built from the grouped columns 'col5' and 'col2', and whose values are the number of rows in each group.
<code class="python">print(result)</code>
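Because the counts come back as a Series with a two-level index, it can be convenient to flatten them into ordinary columns. The following is a minimal sketch, not part of the original walkthrough: it rebuilds only the two grouping columns of 'df' so that it runs on its own, and the names 'counts', 'counts_df', and 'count' are chosen here purely for illustration.

```python
import pandas as pd

# Assumption: only 'col2' and 'col5' from the article's df are reproduced,
# since they are the only columns the grouping uses.
df = pd.DataFrame({
    'col2': list('AAABBBBABCBDDD'),
    'col5': ['1', '3', '3', '2', '4', '2', '5', '3', '6', '3', '5', '1', '1', '1'],
})

# Series indexed by (col5, col2) holding the size of each group.
counts = df.groupby(['col5', 'col2']).size()

# Move both index levels into columns and name the value column 'count'
# (an illustrative name, not one used in the article).
counts_df = counts.reset_index(name='count')
print(counts_df)
```

With this data there are eight distinct ('col5', 'col2') combinations, so 'counts_df' has eight rows and the three columns 'col5', 'col2', and 'count'.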
Determining the Highest Count
To determine the maximum count for each 'col2' value, take the counts from the previous step, group them by the second index level ('col2', selected with level=1), and apply the 'max' function.
<code class="python">result = df.groupby(['col5', 'col2']).size().groupby(level=1).max()</code>
This will produce a Series with the maximum count for each 'col2' value.
<code class="python">print(result)</code>
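The maximum alone does not say which 'col5' value produced it. One possible way to recover that, sketched here under the assumption that a flat table of counts is acceptable, is to use 'idxmax' per 'col2' group and select those rows. The names 'counts_df', 'top', and 'count' are hypothetical, and the data is a reduced copy of the article's 'df' so that the block is self-contained.

```python
import pandas as pd

# Assumption: reduced copy of the article's df with only the grouping columns.
df = pd.DataFrame({
    'col2': list('AAABBBBABCBDDD'),
    'col5': ['1', '3', '3', '2', '4', '2', '5', '3', '6', '3', '5', '1', '1', '1'],
})

# One row per (col5, col2) combination with its group size.
counts_df = df.groupby(['col5', 'col2']).size().reset_index(name='count')

# Row label of the largest count within each col2 group.
best_rows = counts_df.groupby('col2')['count'].idxmax()

# Select those rows: each col2 value with its winning col5 and the count.
top = counts_df.loc[best_rows]
print(top)

# The maxima match the groupby(level=1).max() approach from the article.
max_per_col2 = df.groupby(['col5', 'col2']).size().groupby(level=1).max()
print(max_per_col2)
```

Note that 'idxmax' keeps a single row per group, so when two 'col5' values tie for the highest count (as happens for 'B' in this data), only one of them appears in 'top'.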
In summary, chaining 'groupby' with 'size' counts the rows for every combination of the grouped columns, and a second 'groupby' on one index level followed by 'max' reduces those counts to the highest value per group.