
How to efficiently convert categorical data to numerical indices in Pandas?

Mary-Kate Olsen
2024-10-29 04:43:02

Pandas: Conversion of Categories to Numerical Indices

A column of category labels in a Pandas DataFrame can be converted to numerical indices in two steps by using the categorical dtype:

Step 1: Categorize the Column
First, convert the target column (here, cc) to a categorical dtype:

<code class="python">df.cc = pd.Categorical(df.cc)</code>
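To see which integer each label will map to, the categories can be inspected right after the conversion. The following is a minimal sketch, assuming a small sample frame whose cc values match the result table in this article; the variable name categories is only illustrative.

```python
import pandas as pd

# Assumed sample data: only the cc column is needed for this check.
df = pd.DataFrame({"cc": ["US", "CA", "US", "AU"]})
df["cc"] = pd.Categorical(df["cc"])

# Categories are inferred from the unique values and sorted,
# so a label's position in this list is the code it will receive.
categories = list(df["cc"].cat.categories)
print(categories)
```

If a specific ordering is needed, pd.Categorical also accepts an explicit categories argument, which would change the resulting codes accordingly.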

Step 2: Capture Category Codes
Create a new column named code to store the category codes. On a Series, the codes are reached through the .cat accessor:

<code class="python">df['code'] = df.cc.cat.codes</code>

Result:

The DataFrame now includes a code column holding the integer index of each category. The categories are sorted, so AU is 0, CA is 1, and US is 2:

cc   temp   code
US   37.0   2
CA   12.0   1
US   35.0   2
AU   20.0   0
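Both steps can be combined into one self-contained script. This is a sketch under the assumption that the input frame holds the four rows shown in the result; the sample values are copied from that table, and the codes are read through the .cat accessor of the Series.

```python
import pandas as pd

# Sample data reconstructed from the result table (an assumption about the input).
df = pd.DataFrame({
    "cc": ["US", "CA", "US", "AU"],
    "temp": [37.0, 12.0, 35.0, 20.0],
})

# Step 1: convert the column to a categorical dtype.
df["cc"] = pd.Categorical(df["cc"])

# Step 2: store the integer code of each row's category.
df["code"] = df["cc"].cat.codes

print(df)
```

Missing values in cc, if any, receive the code -1 rather than a category index, which is worth checking before using the codes as array positions.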

Additional Options:

  • To retrieve the codes without modifying the DataFrame:
<code class="python">df.cc.astype('category').cat.codes</code>
  • To use the categorical column as an index:
<code class="python">df2 = pd.DataFrame(df.temp)
df2.index = pd.CategoricalIndex(df.cc)</code>
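The two options above can be tried out as follows. This is a minimal sketch, again assuming the four sample rows from the result table; the names codes and us_rows are illustrative and not part of the original answer.

```python
import pandas as pd

# Assumed sample data, matching the result table.
df = pd.DataFrame({
    "cc": ["US", "CA", "US", "AU"],
    "temp": [37.0, 12.0, 35.0, 20.0],
})

# Option 1: compute the codes on a temporary categorical copy;
# the cc column in df itself is left unchanged.
codes = df["cc"].astype("category").cat.codes

# Option 2: build a second frame indexed by a CategoricalIndex.
df2 = pd.DataFrame(df["temp"])
df2.index = pd.CategoricalIndex(df["cc"])

# Label-based lookup on the categorical index returns every matching row.
us_rows = df2.loc["US"]
print(codes.tolist())
print(us_rows)
```

The first option suits a one-off lookup, while the second keeps the labels available for selection with .loc.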
