How to Calculate the Average Time per Organization Within Each Cluster in a Pandas DataFrame?
Performing Grouped Aggregation and Average Calculations
Consider the following DataFrame with data on cluster, organization, and time:
   cluster org  time
0        1   a     8
1        1   a     6
2        2   h    34
3        1   c    23
4        2   d    74
5        3   w     6
The objective is to calculate the average time per organization within each cluster. The expected result should resemble:
cluster  mean(time)
1        15    # = ((8 + 6) / 2 + 23) / 2
2        54    # = (74 + 34) / 2
3        6
Solution Using Double GroupBy and Mean Calculations:
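To achieve this, apply groupby twice: first group by both cluster and org to get one mean per organization, then group that result by cluster and average again:
# Step 1: mean time for each (cluster, org) pair
cluster_org_time = df.groupby(['cluster', 'org'], as_index=False).mean()
# Step 2: mean of those per-organization means within each cluster
result = cluster_org_time.groupby('cluster')['time'].mean()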
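The two steps can be run end to end as follows. This is a minimal sketch: the DataFrame is rebuilt from the sample data, and the cluster assignment (1, 1, 2, 1, 2, 3) is an assumption inferred from the arithmetic in the expected result.

```python
import pandas as pd

# Sample data; the cluster values are inferred from the expected result.
df = pd.DataFrame({
    "cluster": [1, 1, 2, 1, 2, 3],
    "org": ["a", "a", "h", "c", "d", "w"],
    "time": [8, 6, 34, 23, 74, 6],
})

# Step 1: one mean per (cluster, org) pair, kept as regular columns.
per_org = df.groupby(["cluster", "org"], as_index=False)["time"].mean()

# Step 2: average the per-organization means within each cluster.
result = per_org.groupby("cluster")["time"].mean()

print(result)
# cluster 1 -> 15.0, cluster 2 -> 54.0, cluster 3 -> 6.0
```

Because as_index=False keeps cluster and org as ordinary columns, the second groupby can refer to 'cluster' by name.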
Alternative Solution for Clustered Group Averages:
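To average over all rows in a cluster regardless of organization, group by cluster alone and take the mean of time. This weights every row equally, so it does not reproduce the expected result above when an organization appears more than once in a cluster. Selecting the time column keeps the non-numeric org column out of the aggregation, which pandas 2.0 and later would otherwise reject with a TypeError.
cluster_mean_time = df.groupby('cluster')['time'].mean()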
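To see how a plain per-cluster mean compares with the per-organization average, the two can be placed side by side. This is an illustrative sketch; the cluster values and the names row_weighted, org_weighted, and comparison are assumptions, not part of the original.

```python
import pandas as pd

# Sample data; the cluster values are inferred from the expected result.
df = pd.DataFrame({
    "cluster": [1, 1, 2, 1, 2, 3],
    "org": ["a", "a", "h", "c", "d", "w"],
    "time": [8, 6, 34, 23, 74, 6],
})

# Every row counts once: cluster 1 is (8 + 6 + 23) / 3.
row_weighted = df.groupby("cluster")["time"].mean()

# Every organization counts once: cluster 1 is ((8 + 6) / 2 + 23) / 2.
per_org = df.groupby(["cluster", "org"], as_index=False)["time"].mean()
org_weighted = per_org.groupby("cluster")["time"].mean()

comparison = pd.DataFrame({
    "row_weighted": row_weighted,
    "org_weighted": org_weighted,
})
print(comparison)
```

Only cluster 1 differs here (about 12.33 versus 15), because it is the only cluster in which an organization has more than one row.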
Additional Option for GroupBy with org and Mean Calculation:
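If you only need the mean time of each organization within its cluster, which is the intermediate step of the solution above, group by ['cluster', 'org'] and calculate the mean directly. The result is indexed by (cluster, org):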
cluster_org_mean_time = df.groupby(['cluster', 'org']).mean()
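Grouping by two keys without as_index=False yields a result with a two-level index. The sketch below shows one way to work with it, assuming the same sample data (cluster values inferred from the expected result); the names per_org, by_cluster, and flat are hypothetical.

```python
import pandas as pd

# Sample data; the cluster values are inferred from the expected result.
df = pd.DataFrame({
    "cluster": [1, 1, 2, 1, 2, 3],
    "org": ["a", "a", "h", "c", "d", "w"],
    "time": [8, 6, 34, 23, 74, 6],
})

# Series indexed by (cluster, org), e.g. (1, 'a') -> 7.0.
per_org = df.groupby(["cluster", "org"])["time"].mean()

# Collapse the org level by grouping on the 'cluster' index level.
by_cluster = per_org.groupby(level="cluster").mean()

# Or turn the index levels back into ordinary columns.
flat = per_org.reset_index()

print(per_org)
print(by_cluster)
print(flat)
```

Grouping on level="cluster" gives the same per-cluster values as the two-step solution, so the choice between as_index=False and the index-level form is largely a matter of preference.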