Pandas: Conversion of Categories to Numerical Indices
To convert the categories in a Pandas DataFrame column to numerical indices, convert the column to a categorical dtype and then read off the integer code assigned to each category:
Step 1: Categorize the Column
Firstly, convert the target column (in this case, cc) to a categorical type:
<code class="python">df.cc = pd.Categorical(df.cc)</code>
Step 2: Capture Category Codes
Create a new column named code to store the category codes:
<code class="python">df['code'] = df.cc.cat.codes</code>
Result:
The dataframe now includes a code column whose values index the categories, which are sorted alphabetically by default (AU = 0, CA = 1, US = 2):
cc | temp | code |
---|---|---|
US | 37.0 | 2 |
CA | 12.0 | 1 |
US | 35.0 | 2 |
AU | 20.0 | 0 |
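The two steps can be checked end to end with a minimal sketch. The sample frame here is reconstructed from the table above and is an assumption, not the original poster's data. On a Series, the codes are reached through the `.cat` accessor:

```python
import pandas as pd

# Sample data reconstructed from the result table (assumed input)
df = pd.DataFrame({
    "cc": ["US", "CA", "US", "AU"],
    "temp": [37.0, 12.0, 35.0, 20.0],
})

# Step 1: convert the column to a categorical dtype
df["cc"] = pd.Categorical(df["cc"])

# Step 2: read the integer codes through the .cat accessor
df["code"] = df["cc"].cat.codes

# Categories are sorted by default, so AU -> 0, CA -> 1, US -> 2
print(df)
print(list(df["cc"].cat.categories))
```

The codes are positions in `df["cc"].cat.categories`, so the original label can be recovered by indexing that array with a code.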
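Additional Options:
To get the codes without changing the original column, convert it on the fly:
<code class="python">df.cc.astype('category').cat.codes</code>
Alternatively, build a new dataframe that uses the categories as its index:
<code class="python">df2 = pd.DataFrame(df.temp)
df2.index = pd.CategoricalIndex(df.cc)</code>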
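As a rough sketch of how these two options behave, the example below again assumes the sample data from the result table. A `CategoricalIndex` exposes `codes` directly, while a categorical Series needs the `.cat` accessor:

```python
import pandas as pd

# Sample data reconstructed from the result table (assumed input)
df = pd.DataFrame({
    "cc": ["US", "CA", "US", "AU"],
    "temp": [37.0, 12.0, 35.0, 20.0],
})

# Option 1: compute codes on the fly; df["cc"] itself is left unchanged
codes = df["cc"].astype("category").cat.codes

# Option 2: index a new frame by the categories
df2 = pd.DataFrame(df["temp"])
df2.index = pd.CategoricalIndex(df["cc"])

# A CategoricalIndex carries its codes as an attribute (no .cat accessor)
index_codes = df2.index.codes

print(codes.tolist())
print(list(index_codes))
```

Option 1 suits a one-off lookup. Option 2 keeps the labels visible in the index while the integer codes remain available through `df2.index.codes`.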