How to retrieve the name of the column with the maximum value for each row in a Pandas DataFrame?

Mary-Kate Olsen
2024-11-28 12:41:12

Retrieving the Maximum Value Column Name for Each Row

A common task when working with a DataFrame of numeric columns is to identify, for each row, the column that holds the maximum value. Consider the following DataFrame:

   Communications and Search  Business   General  Lifestyle
0                   0.745763  0.050847  0.118644   0.084746
0                   0.333333  0.000000  0.583333   0.083333
0                   0.617021  0.042553  0.297872   0.042553
0                   0.435897  0.000000  0.410256   0.153846
0                   0.358974  0.076923  0.410256   0.153846

Our goal is to create a new column, 'Max', that holds the name of the column with the maximum value in each row. The desired output looks like this:

   Communications and Search  Business   General  Lifestyle  Max
0                   0.745763  0.050847  0.118644   0.084746  Communications and Search
0                   0.333333  0.000000  0.583333   0.083333  General
0                   0.617021  0.042553  0.297872   0.042553  Communications and Search
0                   0.435897  0.000000  0.410256   0.153846  Communications and Search
0                   0.358974  0.076923  0.410256   0.153846  General

To accomplish this, we can use the DataFrame.idxmax method with axis=1:

import pandas as pd

# Create a DataFrame
df = pd.DataFrame({
    'Communications and Search': [0.745763, 0.333333, 0.617021, 0.435897, 0.358974],
    'Business': [0.050847, 0.000000, 0.042553, 0.000000, 0.076923],
    'General': [0.118644, 0.583333, 0.297872, 0.410256, 0.410256],
    'Lifestyle': [0.084746, 0.083333, 0.042553, 0.153846, 0.153846]
})

# Find the label (name) of the column holding the maximum value in each row
max_column_idxs = df.idxmax(axis=1)

# Store those column labels in a new 'Max' column
df['Max'] = max_column_idxs

# Display the updated DataFrame
print(df)

Calling idxmax with axis=1 searches across the columns of each row and returns a Series containing, for every row, the label of the column that holds that row's maximum value. Assigning this Series to df['Max'] adds the new column, so the DataFrame now matches the desired output.
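
One practical detail: once 'Max' has been added, the DataFrame contains a string column, so calling df.idxmax(axis=1) on the whole frame again would mix strings with numbers. The following is a minimal sketch, not the article's original code, that restricts the comparison to an explicit list of columns. The names value_cols, MaxValue, and tied are assumptions introduced here for illustration, and the data is a three-row subset of the example above.

```python
import pandas as pd

# Three-row subset of the example data above
df = pd.DataFrame({
    'Communications and Search': [0.745763, 0.333333, 0.617021],
    'Business': [0.050847, 0.000000, 0.042553],
    'General': [0.118644, 0.583333, 0.297872],
    'Lifestyle': [0.084746, 0.083333, 0.042553],
})

# Hypothetical helper list: the numeric columns that take part in the comparison
value_cols = ['Communications and Search', 'Business', 'General', 'Lifestyle']

# Label of the column holding each row's maximum, computed only over value_cols
df['Max'] = df[value_cols].idxmax(axis=1)

# Optionally keep the maximum value itself next to its column label
df['MaxValue'] = df[value_cols].max(axis=1)

# When several columns tie for the maximum, idxmax returns the first one
# in column order
tied = pd.DataFrame({'A': [1.0], 'B': [1.0]})
first_label = tied.idxmax(axis=1).iloc[0]

print(df)
print(first_label)
```

Selecting the columns explicitly keeps the result stable if the code is rerun or if further non-numeric columns are added later.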
