How to Convert Categorical Data to Numerical Indices in Pandas?

Mary-Kate Olsen
2024-10-28 11:00:30

Pandas: Convert Categories to Numerical Indices

In Pandas, you may need to convert categorical data, such as country codes, into numerical indices. pd.get_dummies produces a one-hot encoding with one column per category, which is more than you need when a single integer column will do. Here's a step-by-step guide to converting categories to numerical indices:
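To make the difference concrete, here is a minimal sketch comparing the shape of a one-hot encoding with that of an integer encoding. The Series cc and its values are hypothetical, chosen only for illustration.

```python
import pandas as pd

# Hypothetical country-code column, used only for illustration.
cc = pd.Series(["US", "CA", "US", "AU"], name="cc")

# One-hot encoding: one column per distinct category.
dummies = pd.get_dummies(cc)

# Integer encoding: a single value per row.
codes = cc.astype("category").cat.codes

print(dummies.shape)  # (4, 3)
print(codes.shape)    # (4,)
```

With three distinct countries the one-hot frame already has three columns; the integer encoding stays one-dimensional however many categories there are.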

Step 1: Categorize the Column

First, change the type of the column to categorical:

<code class="python">df.cc = pd.Categorical(df.cc)</code>

This converts the cc column, which holds the country codes, to pandas' categorical dtype. Each distinct value becomes one category.

Step 2: Create a New Column for Codes

Next, create a new column to store the numerical indices:

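<code class="python">df['code'] = df.cc.cat.codes</code>

On a categorical column, the codes are reached through the .cat accessor; calling df.cc.codes directly raises an AttributeError. Each code is the integer position of the row's category in df.cc.cat.categories. By default the categories are the sorted distinct values, so the codes follow that sorted order.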
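The relationship between codes and categories can be checked with a small sketch. The Series below is hypothetical, and None stands in for a missing country to show that missing values receive the code -1.

```python
import pandas as pd

# Hypothetical values; None stands in for a missing country.
s = pd.Series(["US", "CA", None, "AU"]).astype("category")

# Sorted distinct values, excluding the missing entry.
categories = list(s.cat.categories)

# Position of each row's category; -1 marks a missing value.
codes = s.cat.codes.tolist()

# Map the codes back to their labels, keeping missing values as None.
labels = [categories[c] if c >= 0 else None for c in codes]

print(categories)  # ['AU', 'CA', 'US']
print(codes)       # [2, 1, -1, 0]
```

Because a code is just an index into the category list, the original labels can always be recovered from the codes.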

Example:

Consider the following DataFrame:

   cc  temp
0  US  37.0
1  CA  12.0
2  US  35.0
3  AU  20.0

After following the steps above, the DataFrame has an additional code column. The categories sort as AU, CA, US, so they receive the codes 0, 1, and 2:

   cc  temp  code
0  US  37.0     2
1  CA  12.0     1
2  US  35.0     2
3  AU  20.0     0
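The example can be reproduced end to end with the following sketch. The DataFrame is built by hand from the table above, and bracket indexing is used instead of attribute access as a matter of style.

```python
import pandas as pd

# Rebuild the example DataFrame from the table above.
df = pd.DataFrame({
    "cc": ["US", "CA", "US", "AU"],
    "temp": [37.0, 12.0, 35.0, 20.0],
})

# Step 1: convert the column to the categorical dtype.
df["cc"] = pd.Categorical(df["cc"])

# Step 2: store the integer code of each row's category.
df["code"] = df["cc"].cat.codes

print(df)
```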

Additional Options:

  • Get codes without modifying the DataFrame: df.cc.astype('category').cat.codes
  • Use the categorical column as the index: df2 = pd.DataFrame(df.temp); df2.index = pd.CategoricalIndex(df.cc)
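Both options can be sketched as follows, again assuming the example DataFrame from above. Note that a CategoricalIndex exposes its codes directly through a codes attribute, without the .cat accessor that a Series requires.

```python
import pandas as pd

df = pd.DataFrame({
    "cc": ["US", "CA", "US", "AU"],
    "temp": [37.0, 12.0, 35.0, 20.0],
})

# Option 1: compute the codes on the fly; df itself is left unchanged.
codes = df["cc"].astype("category").cat.codes

# Option 2: keep only temp as data and use the countries as a categorical index.
df2 = pd.DataFrame(df["temp"])
df2.index = pd.CategoricalIndex(df["cc"])

# The index carries the same integer codes.
index_codes = list(df2.index.codes)

print(codes.tolist())  # [2, 1, 2, 0]
print(index_codes)     # [2, 1, 2, 0]
```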
