How to One-Hot Encode Index Arrays in NumPy?

Linda Hamilton | Original: 2024-10-30 22:50:03 | 780 views

One-Hot Encoding of Index Arrays in NumPy

One-hot encoding converts an array of integer indices into a binary matrix: each index becomes a row vector in which the element at that index is 1 and all other elements are 0. This is useful in machine learning when the indices stand for categorical data, such as class labels, or otherwise serve as feature values.

To achieve one-hot encoding in NumPy, we follow a simple process:

1. Create a zero-initialized array with one row per element of the index array and a number of columns equal to the maximum value of the index array plus one.
2. In each row of that array, set the column given by the index for that row to 1.

Consider the following example:

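<code class="python">import numpy as np

a = np.array([1, 0, 3])               # index array
b = np.zeros((a.size, a.max() + 1))   # one row per index, max + 1 columns
b[np.arange(a.size), a] = 1           # set column a[i] of row i to 1</code>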

In this example, the largest value in the index array a is 3, so the zero-filled array b is created with a.size = 3 rows and a.max() + 1 = 4 columns. np.arange(a.size) generates the row indices [0, 1, 2]; pairing them with the column indices in a through integer array indexing sets b[0, 1], b[1, 0], and b[2, 3] to 1.

The resulting array b is now a one-hot encoded representation of the original index array a:

array([[ 0.,  1.,  0.,  0.],
       [ 1.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  1.]])
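
The two steps above can be wrapped in a small reusable function. The following is a minimal sketch, not part of the original example: the name one_hot and the parameters num_classes and dtype are hypothetical, introduced here for illustration. It assumes a 1-D array of non-negative integers and lets the caller fix the number of columns when the largest class may be absent from the data.

```python
import numpy as np

def one_hot(indices, num_classes=None, dtype=float):
    """Hypothetical helper: one-hot encode a 1-D array of non-negative integers."""
    indices = np.asarray(indices)
    if indices.ndim != 1 or indices.size == 0:
        raise ValueError("indices must be a non-empty 1-D array")
    if not np.issubdtype(indices.dtype, np.integer):
        raise TypeError("indices must have an integer dtype")
    if indices.min() < 0:
        raise ValueError("indices must be non-negative")
    if num_classes is None:
        # Same rule as in the article: largest index plus one.
        num_classes = int(indices.max()) + 1
    elif indices.max() >= num_classes:
        raise ValueError("an index is out of range for num_classes")
    out = np.zeros((indices.size, num_classes), dtype=dtype)
    # Row i gets a 1 in column indices[i].
    out[np.arange(indices.size), indices] = 1
    return out

encoded = one_hot([1, 0, 3], num_classes=5, dtype=np.uint8)
print(encoded.shape)  # (3, 5)
```

Passing num_classes explicitly keeps the width stable across batches, and an integer dtype avoids storing floats when only 0 and 1 are needed.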

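Each row of b contains exactly one 1, so every index is represented as a distinct category rather than as an ordered numeric value, which is the form many machine learning algorithms expect for categorical inputs. Because np.zeros defaults to a floating-point dtype, the entries of b are floats.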
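
As a point of comparison, the same result can be obtained by indexing into an identity matrix. This is a different technique from the one shown in the article, sketched here under the same assumption of non-negative integer indices; the variable names are illustrative only.

```python
import numpy as np

a = np.array([1, 0, 3])

# Approach from the article: zero array plus integer array indexing.
b_index = np.zeros((a.size, a.max() + 1))
b_index[np.arange(a.size), a] = 1

# Alternative: row k of the identity matrix is the one-hot vector for index k,
# so selecting rows by a yields the encoding directly.
b_eye = np.eye(a.max() + 1)[a]

print(np.array_equal(b_index, b_eye))  # True

# The identity-matrix form also accepts multi-dimensional index arrays;
# the one-hot axis is appended as the last dimension.
a2 = np.array([[1, 0], [3, 2]])
b2 = np.eye(a2.max() + 1)[a2]
print(b2.shape)  # (2, 2, 4)
```

The identity-matrix version is shorter but builds a full square matrix first, so the zero-array approach may be preferable when the number of classes is large.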
