How to Convert Pandas Categorical Columns to Numerical Indices Without `get_dummies` and `numpy`?

Susan Sarandon
2024-10-27 22:51:02

Convert Pandas Categories to Numbers

Consider a DataFrame with a categorical column, such as country codes:

cc | temp
US | 37.0
CA | 12.0
US | 35.0
AU | 20.0

To convert these categories to integer indices without using get_dummies or numpy, follow these steps:

  1. Categorize the Column: Convert the column to a categorical dtype:
df.cc = pd.Categorical(df.cc)
  2. Retrieve Category Codes: Use the .cat.codes accessor to retrieve the integer code for each row's category:
df['code'] = df.cc.cat.codes

The resulting DataFrame will include a new column called code with the numerical indices:

   cc  temp  code
0  US  37.0     2
1  CA  12.0     1
2  US  35.0     2
3  AU  20.0     0
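
The two steps above can be put together in a minimal, self-contained sketch. The DataFrame is rebuilt here from the sample table, and the codes of the categorical Series are read through the `.cat` accessor; the bracket-style column access is a stylistic choice of this sketch.

```python
import pandas as pd

# Sample data copied from the table above
df = pd.DataFrame({
    "cc": ["US", "CA", "US", "AU"],
    "temp": [37.0, 12.0, 35.0, 20.0],
})

# Step 1: store the column as a categorical
df["cc"] = pd.Categorical(df["cc"])

# Step 2: integer code of each row's category
df["code"] = df["cc"].cat.codes

print(df)
```

When categories are inferred from the data like this, they are sorted, so AU maps to 0, CA to 1, and US to 2. Missing values, if any, receive the code -1.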

Alternatively, you can obtain the category codes without modifying the DataFrame:

df.cc.astype('category').cat.codes
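
As a rough illustration of this non-modifying variant, the sketch below keeps the original column untouched and also builds a lookup from each code back to its label. The names `as_cat` and `code_to_label` are hypothetical helpers introduced only for this example.

```python
import pandas as pd

df = pd.DataFrame({
    "cc": ["US", "CA", "US", "AU"],
    "temp": [37.0, 12.0, 35.0, 20.0],
})

# Temporary categorical copy; df itself is left unchanged
as_cat = df["cc"].astype("category")
codes = as_cat.cat.codes

# Map each integer code back to its original label
code_to_label = dict(enumerate(as_cat.cat.categories))

print(codes.tolist())   # [2, 1, 2, 0]
print(code_to_label)    # {0: 'AU', 1: 'CA', 2: 'US'}
print(df.columns.tolist())  # still only ['cc', 'temp']
```

Keeping the code-to-label mapping around is useful when the integer codes need to be translated back to country codes later.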
  3. Use as Index: If desired, you can use the categorical column as the index of another DataFrame:
df2 = pd.DataFrame(df.temp)
df2.index = pd.CategoricalIndex(df.cc)
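
A short sketch of how such a categorical index might then be used, assuming the same sample data. The variable `us_rows` is a hypothetical name for this example; note that a `CategoricalIndex` exposes `.codes` directly, without the `.cat` accessor needed on a Series.

```python
import pandas as pd

df = pd.DataFrame({
    "cc": ["US", "CA", "US", "AU"],
    "temp": [37.0, 12.0, 35.0, 20.0],
})

# New DataFrame holding only the temperatures, indexed by country category
df2 = pd.DataFrame(df["temp"])
df2.index = pd.CategoricalIndex(df["cc"])

# Label-based selection returns every row belonging to that category
us_rows = df2.loc["US"]
print(us_rows)

# Integer codes are available on the index itself
print(df2.index.codes.tolist())  # [2, 1, 2, 0]
```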
