How to Create a New Race Label Column in Pandas Based on Multiple Existing Columns?

Susan Sarandon
2024-12-18 20:27:10

Create a New Column Based on Values from Multiple Columns in Pandas

To create a new column in a pandas DataFrame based on values from several other columns, we can use the DataFrame.apply() method. Called with axis=1, it passes each row to a custom function and collects the return values into a new Series.
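As a minimal sketch of the mechanism, the toy example below applies a function row by row. The DataFrame, its column names (a, b, total), and the add_columns helper are all made up for illustration and are not part of the race-label data.

```python
import pandas as pd

# Hypothetical toy data; the column names are invented for this sketch.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

def add_columns(row):
    # With axis=1, `row` is a Series indexed by column name.
    return row["a"] + row["b"]

# One return value per row becomes one entry in the new column.
df["total"] = df.apply(add_columns, axis=1)
print(df["total"].tolist())  # [11, 22, 33]
```

The same pattern carries over to any function that reads several columns of a row and returns a single value.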

In this case, we want to create a new column with race labels based on the following criteria:

Race Label Criteria:

  • If the ERI_Hispanic column is 1, the label is "Hispanic".
  • Else if the sum of the remaining ERI columns is greater than 1, the label is "Two Or More".
  • Else if the ERI_AmerInd_AKNatv column is 1, the label is "A/I AK Native".
  • Else if the ERI_Asian column is 1, the label is "Asian".
  • Else if the ERI_Black_Afr.Amer column is 1, the label is "Black/AA".
  • Else if the ERI_HI_PacIsl column is 1, the label is "Haw/Pac Isl." (the trailing period is part of the label).
  • Else if the ERI_White column is 1, the label is "White".
  • Otherwise, the label is "Other".

Custom Function for Race Labeling:

To define the custom function for race labeling, we can use the following code:

def label_race(row):
    # Hispanic takes precedence over every other flag.
    if row['ERI_Hispanic'] == 1:
        return 'Hispanic'
    # More than one of the remaining flags set means multiple races.
    if (row['ERI_AmerInd_AKNatv'] + row['ERI_Asian'] + row['ERI_Black_Afr.Amer']
            + row['ERI_HI_PacIsl'] + row['ERI_White']) > 1:
        return 'Two Or More'
    # From here on, at most one of the remaining flags is set.
    if row['ERI_AmerInd_AKNatv'] == 1:
        return 'A/I AK Native'
    if row['ERI_Asian'] == 1:
        return 'Asian'
    if row['ERI_Black_Afr.Amer'] == 1:
        return 'Black/AA'
    if row['ERI_HI_PacIsl'] == 1:
        return 'Haw/Pac Isl.'
    if row['ERI_White'] == 1:
        return 'White'
    # No flag is set.
    return 'Other'
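Before applying the function to a whole dataframe, it can help to check the order of the rules on a few hand-built rows. The sketch below repeats label_race so that it runs on its own; ERI_COLUMNS and make_row are hypothetical helpers introduced only for this check, and the flag combinations are invented test inputs.

```python
import pandas as pd

def label_race(row):
    if row['ERI_Hispanic'] == 1:
        return 'Hispanic'
    if (row['ERI_AmerInd_AKNatv'] + row['ERI_Asian'] + row['ERI_Black_Afr.Amer']
            + row['ERI_HI_PacIsl'] + row['ERI_White']) > 1:
        return 'Two Or More'
    if row['ERI_AmerInd_AKNatv'] == 1:
        return 'A/I AK Native'
    if row['ERI_Asian'] == 1:
        return 'Asian'
    if row['ERI_Black_Afr.Amer'] == 1:
        return 'Black/AA'
    if row['ERI_HI_PacIsl'] == 1:
        return 'Haw/Pac Isl.'
    if row['ERI_White'] == 1:
        return 'White'
    return 'Other'

# Hypothetical helper names, used only for this sanity check.
ERI_COLUMNS = ['ERI_Hispanic', 'ERI_AmerInd_AKNatv', 'ERI_Asian',
               'ERI_Black_Afr.Amer', 'ERI_HI_PacIsl', 'ERI_White']

def make_row(*set_flags):
    # Build a row in which only the named columns are 1 and the rest are 0.
    return pd.Series({col: int(col in set_flags) for col in ERI_COLUMNS})

print(label_race(make_row('ERI_Hispanic', 'ERI_White')))  # Hispanic
print(label_race(make_row('ERI_Asian', 'ERI_White')))     # Two Or More
print(label_race(make_row('ERI_HI_PacIsl')))              # Haw/Pac Isl.
print(label_race(make_row()))                             # Other
```

The first case shows why the order matters: a row flagged both Hispanic and White is labeled "Hispanic" because that rule is checked before the multi-race sum.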

Applying the Custom Function with apply():

To apply the label_race function to each row of the dataframe, we can use the apply() function with the axis=1 argument, which specifies that the function should be applied to each row:

df['race_label'] = df.apply(label_race, axis=1)

This will create a new column named race_label in the dataframe with the appropriate race labels.
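Because apply() with axis=1 calls a Python function once per row, it can be slow on large dataframes. One possible alternative, not part of the approach above, is numpy.select, which evaluates the same ordered rules on whole columns at once. The sketch below assumes the flag columns hold only 0 and 1 with no missing values; the sample data is invented for illustration.

```python
import numpy as np
import pandas as pd

# Invented sample data: one row per rule we want to exercise.
df = pd.DataFrame({
    "ERI_Hispanic":       [1, 0, 0, 0, 0],
    "ERI_AmerInd_AKNatv": [0, 0, 1, 0, 0],
    "ERI_Asian":          [0, 1, 0, 0, 0],
    "ERI_Black_Afr.Amer": [0, 0, 0, 0, 0],
    "ERI_HI_PacIsl":      [0, 0, 0, 0, 0],
    "ERI_White":          [1, 1, 0, 1, 0],
})

non_hispanic = ["ERI_AmerInd_AKNatv", "ERI_Asian", "ERI_Black_Afr.Amer",
                "ERI_HI_PacIsl", "ERI_White"]

# Conditions are checked in order; the first one that is True wins,
# which mirrors the if-chain in label_race.
conditions = [
    df["ERI_Hispanic"] == 1,
    df[non_hispanic].sum(axis=1) > 1,
    df["ERI_AmerInd_AKNatv"] == 1,
    df["ERI_Asian"] == 1,
    df["ERI_Black_Afr.Amer"] == 1,
    df["ERI_HI_PacIsl"] == 1,
    df["ERI_White"] == 1,
]
labels = ["Hispanic", "Two Or More", "A/I AK Native", "Asian",
          "Black/AA", "Haw/Pac Isl.", "White"]

df["race_label"] = np.select(conditions, labels, default="Other")
print(df["race_label"].tolist())
# ['Hispanic', 'Two Or More', 'A/I AK Native', 'White', 'Other']
```

If the flag columns can contain missing values, the two approaches may differ, since a row-wise sum with + propagates NaN while DataFrame.sum skips it by default.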
