One-Hot Encoding of Index Arrays in NumPy
Converting an array of indices into a one-hot encoded array is a common preprocessing step in machine learning. One-hot encoding represents each index as a binary vector in which the element at that index is 1 and all others are 0. It is particularly useful for categorical data, where the indices serve as feature values or class labels.
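To one-hot encode in NumPy, allocate a zero-filled array with one row per index and one column per possible value, then use integer array indexing to set a single 1 in each row. Consider the following example: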
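<code class="python">import numpy as np

a = np.array([1, 0, 3])                 # index array
b = np.zeros((a.size, a.max() + 1))     # one row per index, one column per value
b[np.arange(a.size), a] = 1             # set column a[i] of row i to 1</code>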
In this example, the largest value in the index array a is 3, so a.max() + 1 gives 4 columns, and we create a zero-filled array b with one row per element of a. We then use np.arange(a.size) to generate the row indices 0, 1, 2 and pair them with the values in a as column indices, so that b[i, a[i]] is set to 1 for every row i.
The resulting array b is now a one-hot encoded representation of the original index array a:
array([[0., 1., 0., 0.],
       [1., 0., 0., 0.],
       [0., 0., 0., 1.]])
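Sizing the columns from a.max() + 1 undercounts when the largest class happens to be absent from the data. A minimal sketch of a reusable helper follows, assuming a non-empty 1-D array of non-negative integers. The name one_hot and its num_classes and dtype parameters are hypothetical and not part of NumPy.

```python
import numpy as np

def one_hot(indices, num_classes=None, dtype=np.float64):
    """Hypothetical helper: one-hot encode a 1-D array of non-negative ints."""
    indices = np.asarray(indices)
    if num_classes is None:
        # Fall back to inferring the width from the largest index present.
        num_classes = int(indices.max()) + 1
    encoded = np.zeros((indices.size, num_classes), dtype=dtype)
    # Pair each row number with its class index and set that cell to 1.
    encoded[np.arange(indices.size), indices] = 1
    return encoded

a = np.array([1, 0, 3])
b = one_hot(a, num_classes=5, dtype=np.int8)
print(b)
```

Passing num_classes explicitly keeps the width stable across batches, and an integer dtype such as np.int8 uses less memory than the default float64 when the values only need to be 0 or 1.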
Because each row contains a single 1, the encoding preserves the categorical nature of the index values without implying any order among them, and the array can be passed directly to machine learning algorithms that expect numeric input.
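One way to check that the encoding loses no information is to decode it with argmax and compare the result against the original indices. The sketch below also uses a common alternative construction that indexes the rows of an identity matrix. It is offered as an illustration rather than as the article's method, and the variable names are chosen only for this example.

```python
import numpy as np

a = np.array([1, 0, 3])
num_classes = int(a.max()) + 1

# Alternative construction: row i of the identity matrix is the
# one-hot vector for class i, so indexing with a picks one row per index.
b = np.eye(num_classes)[a]

# Each row contains exactly one 1, so argmax along axis 1 recovers the index.
decoded = b.argmax(axis=1)
print(decoded)                     # [1 0 3]
print(np.array_equal(decoded, a))  # True
```

The identity-matrix form is compact, but it builds a full num_classes by num_classes matrix first, so the zero-fill-and-assign approach may be preferable when the number of classes is large.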