Create a New Column Based on Values from Multiple Columns in Pandas
To create a new column in a pandas DataFrame based on values from several other columns, we can use the apply() function. Called with axis=1, it passes each row of the DataFrame to a custom function and collects the return values (the default, axis=0, would pass each column instead).
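In this case, we want to create a new column with race labels. The criteria are checked in order, and the first one that matches determines the label:
Race Label Criteria:
1. If ERI_Hispanic is 1, the label is 'Hispanic', regardless of the other flags.
2. Otherwise, if the sum of ERI_AmerInd_AKNatv, ERI_Asian, ERI_Black_Afr.Amer, ERI_HI_PacIsl, and ERI_White is greater than 1, the label is 'Two Or More'.
3. Otherwise, if ERI_AmerInd_AKNatv is 1, the label is 'A/I AK Native'.
4. Otherwise, if ERI_Asian is 1, the label is 'Asian'.
5. Otherwise, if ERI_Black_Afr.Amer is 1, the label is 'Black/AA'.
6. Otherwise, if ERI_HI_PacIsl is 1, the label is 'Haw/Pac Isl.'.
7. Otherwise, if ERI_White is 1, the label is 'White'.
8. If none of the above applies, the label is 'Other'.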
Custom Function for Race Labeling:
To define the custom function for race labeling, we can use the following code:
def label_race(row):
    # Hispanic takes precedence over every other flag.
    if row['ERI_Hispanic'] == 1:
        return 'Hispanic'
    # More than one of the remaining flags set means a multiracial label.
    if row['ERI_AmerInd_AKNatv'] + row['ERI_Asian'] + row['ERI_Black_Afr.Amer'] + row['ERI_HI_PacIsl'] + row['ERI_White'] > 1:
        return 'Two Or More'
    # From here on, at most one of the remaining flags can be set.
    if row['ERI_AmerInd_AKNatv'] == 1:
        return 'A/I AK Native'
    if row['ERI_Asian'] == 1:
        return 'Asian'
    if row['ERI_Black_Afr.Amer'] == 1:
        return 'Black/AA'
    if row['ERI_HI_PacIsl'] == 1:
        return 'Haw/Pac Isl.'
    if row['ERI_White'] == 1:
        return 'White'
    return 'Other'
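Before applying the function to a whole dataframe, it can help to check it against a few hand-built rows. The sketch below repeats label_race so that it runs on its own; the make_row helper and the FLAG_COLUMNS list are hypothetical names introduced only for this illustration, and the sample flag values are made up.

```python
import pandas as pd

# Column names taken from the article; FLAG_COLUMNS itself is an illustrative name.
FLAG_COLUMNS = [
    "ERI_Hispanic", "ERI_AmerInd_AKNatv", "ERI_Asian",
    "ERI_Black_Afr.Amer", "ERI_HI_PacIsl", "ERI_White",
]


def label_race(row):
    # Same logic as the article's function: first matching rule wins.
    if row["ERI_Hispanic"] == 1:
        return "Hispanic"
    if (row["ERI_AmerInd_AKNatv"] + row["ERI_Asian"] + row["ERI_Black_Afr.Amer"]
            + row["ERI_HI_PacIsl"] + row["ERI_White"]) > 1:
        return "Two Or More"
    if row["ERI_AmerInd_AKNatv"] == 1:
        return "A/I AK Native"
    if row["ERI_Asian"] == 1:
        return "Asian"
    if row["ERI_Black_Afr.Amer"] == 1:
        return "Black/AA"
    if row["ERI_HI_PacIsl"] == 1:
        return "Haw/Pac Isl."
    if row["ERI_White"] == 1:
        return "White"
    return "Other"


def make_row(flags):
    """Hypothetical helper: a row with every flag 0 except those given in `flags`."""
    values = {col: 0 for col in FLAG_COLUMNS}
    values.update(flags)
    return pd.Series(values)


# Hispanic wins even when another flag is also set.
print(label_race(make_row({"ERI_Hispanic": 1, "ERI_White": 1})))  # Hispanic
# Two non-Hispanic flags give the multiracial label.
print(label_race(make_row({"ERI_Asian": 1, "ERI_White": 1})))     # Two Or More
# A single flag maps to its own label.
print(label_race(make_row({"ERI_Black_Afr.Amer": 1})))            # Black/AA
# No flags at all falls through to the default.
print(label_race(make_row({})))                                   # Other
```

Because the function only uses row['...'] lookups, it works the same way on a pandas Series or on a plain dict, which keeps this kind of spot check cheap.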
Applying the Custom Function with apply():
To apply the label_race function to each row of the dataframe, we can use the apply() function with the axis=1 argument, which specifies that the function should be applied to each row:
df['race_label'] = df.apply(label_race, axis=1)
This will create a new column named race_label in the dataframe with the appropriate race labels.
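As a rough end-to-end illustration, the sketch below builds a small dataframe with made-up flag values, applies label_race row by row, and then computes the same labels with numpy.select. The numpy.select variant is not part of the original approach; it is shown only as a possible vectorized alternative for larger dataframes, where a row-wise apply() can be slow. The column name race_label_vectorized and the sample data are assumptions made for this example.

```python
import numpy as np
import pandas as pd


def label_race(row):
    # Same logic as the article's function: first matching rule wins.
    if row["ERI_Hispanic"] == 1:
        return "Hispanic"
    if (row["ERI_AmerInd_AKNatv"] + row["ERI_Asian"] + row["ERI_Black_Afr.Amer"]
            + row["ERI_HI_PacIsl"] + row["ERI_White"]) > 1:
        return "Two Or More"
    if row["ERI_AmerInd_AKNatv"] == 1:
        return "A/I AK Native"
    if row["ERI_Asian"] == 1:
        return "Asian"
    if row["ERI_Black_Afr.Amer"] == 1:
        return "Black/AA"
    if row["ERI_HI_PacIsl"] == 1:
        return "Haw/Pac Isl."
    if row["ERI_White"] == 1:
        return "White"
    return "Other"


# Made-up sample data: one row per case worth checking.
df = pd.DataFrame({
    "ERI_Hispanic":       [1, 0, 0, 0, 0],
    "ERI_AmerInd_AKNatv": [0, 0, 0, 0, 0],
    "ERI_Asian":          [0, 1, 1, 0, 0],
    "ERI_Black_Afr.Amer": [0, 0, 0, 1, 0],
    "ERI_HI_PacIsl":      [0, 0, 0, 0, 0],
    "ERI_White":          [1, 1, 0, 0, 0],
})

# Row-wise approach from the article.
df["race_label"] = df.apply(label_race, axis=1)

# Vectorized alternative (an assumption of this sketch, not the article's method).
# np.select returns the choice for the first condition that is True in each row,
# so the list order mirrors the order of the if-statements in label_race.
non_hispanic = ["ERI_AmerInd_AKNatv", "ERI_Asian", "ERI_Black_Afr.Amer",
                "ERI_HI_PacIsl", "ERI_White"]
conditions = [
    df["ERI_Hispanic"] == 1,
    df[non_hispanic].sum(axis=1) > 1,
    df["ERI_AmerInd_AKNatv"] == 1,
    df["ERI_Asian"] == 1,
    df["ERI_Black_Afr.Amer"] == 1,
    df["ERI_HI_PacIsl"] == 1,
    df["ERI_White"] == 1,
]
choices = ["Hispanic", "Two Or More", "A/I AK Native", "Asian",
           "Black/AA", "Haw/Pac Isl.", "White"]
df["race_label_vectorized"] = np.select(conditions, choices, default="Other")

print(df[["race_label", "race_label_vectorized"]])
```

With this sample data, both columns should contain Hispanic, Two Or More, Asian, Black/AA, and Other, in that order. The apply() version is usually easier to read and adjust; the vectorized version is mainly worth considering when the dataframe is large enough for row-wise calls to become a bottleneck.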