How to Sum DataFrame Rows for Specific Columns in Pandas
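For a given DataFrame, you may need to compute, for each row, the sum of the values in specific columns. An attempt such as df[['a', 'b', 'd']].map(sum) does not work: in Pandas versions that provide DataFrame.map, it applies the function to each element individually, so sum is called on single values rather than on rows, and older versions have no DataFrame.map at all.
The appropriate operation is sum() with axis=1, which adds the values across the columns of each row. Non-numeric columns need attention: older Pandas versions silently dropped them in this case, but since Pandas 2.0 numeric_only defaults to False, so a mix of numeric and string columns typically raises a TypeError. Pass numeric_only=True to restrict the sum to numeric columns.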
For example, consider a DataFrame with columns 'a', 'b', 'c', and 'd', where 'c' is a non-numeric column:
import pandas as pd
df = pd.DataFrame({'a': [1, 2, 3], 'b': [2, 3, 4], 'c': ['dd', 'ee', 'ff'], 'd': [5, 9, 1]})
To calculate the sum of columns 'a', 'b', and 'd', we can use:
df['e'] = df.sum(axis=1, numeric_only=True)
This adds a column 'e' containing the row-wise sums of 'a', 'b', and 'd' (8, 14, and 8 for the three rows).
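When the columns to sum are known by name, they can also be selected explicitly instead of relying on numeric_only. The following is a minimal sketch of that variant, reusing the example data from above; the variable name cols is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'a': [1, 2, 3],
    'b': [2, 3, 4],
    'c': ['dd', 'ee', 'ff'],  # non-numeric column, not part of the sum
    'd': [5, 9, 1],
})

# Select only the columns to add up, then sum across each row (axis=1).
cols = ['a', 'b', 'd']  # illustrative name for the list of columns
df['e'] = df[cols].sum(axis=1)

print(df)
```

Because only numeric columns are selected, numeric_only is not needed here, and the string column 'c' never takes part in the calculation.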
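To sum most columns while excluding a few, build a list of the column names and drop the unwanted ones with col_list.remove(column_name). The example assumes df is the original four-column DataFrame (if 'e' was already added, remove it from col_list as well). numeric_only=True is still needed because col_list contains the string column 'c'.
col_list = list(df)
col_list.remove('d')
df['e'] = df[col_list].sum(axis=1, numeric_only=True)
This creates a new column 'e' holding the sum of the remaining numeric columns, here 'a' and 'b'.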
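The same exclusion can be written without mutating a list, by keeping only the numeric columns and then dropping the unwanted one. This is a sketch of one possible alternative, not the article's original method; the name numeric is an assumption used for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'a': [1, 2, 3],
    'b': [2, 3, 4],
    'c': ['dd', 'ee', 'ff'],
    'd': [5, 9, 1],
})

# Keep only numeric columns ('a', 'b', 'd'), so the string column 'c' is excluded.
numeric = df.select_dtypes(include='number')

# Drop the column that should not be part of the sum, then sum each row.
df['e'] = numeric.drop(columns=['d']).sum(axis=1)

print(df)
```

Since drop returns a new DataFrame, the original columns of df are left untouched and only 'e' is added.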