Taking Subarrays from a NumPy Array with a Given Stride/Stepsize Efficiently
In data analysis, we often need to extract subarrays (windows) at a fixed stride or step size from a larger array. NumPy, the popular Python library for numerical operations, offers several ways to do this efficiently.
Problem Statement:
Given a one-dimensional NumPy array, we want to extract a matrix whose rows are subarrays of a fixed length, taken at a specific stride or stepsize. Here, the stride is the number of elements between the starts of consecutive subarrays.
Discussion:
One straightforward way to create subarrays is to iterate over the original array using a for-loop. While this approach works, it can be slow for large arrays.
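For reference, the loop-based baseline might look like the sketch below. The function name loop_app is hypothetical and chosen only for illustration; the row count uses the same formula as the vectorized versions that follow.

```python
import numpy as np

def loop_app(a, L, S):
    # Hypothetical baseline: copy one length-L window per iteration.
    # Number of complete windows when starts are S elements apart.
    nrows = ((a.size - L) // S) + 1
    out = np.empty((nrows, L), dtype=a.dtype)
    for i in range(nrows):
        out[i] = a[i * S : i * S + L]
    return out

a = np.arange(1, 12)  # the values 1..11
result = loop_app(a, L=5, S=3)
```

Each iteration runs in the Python interpreter, which is why this version tends to fall behind once the number of windows grows.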
Approach 1: Broadcasting
NumPy's broadcasting mechanism allows us to create subarrays without loops. We can use the following function that takes the array, subarray length (L), and stride (S):
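import numpy as np

def broadcasting_app(a, L, S):
    # Number of complete length-L windows when starts are S elements apart
    nrows = ((a.size - L) // S) + 1
    # (nrows, 1) start indices + (L,) offsets -> (nrows, L) index matrix
    return a[S * np.arange(nrows)[:, None] + np.arange(L)]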
Explanation:
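S * np.arange(nrows) gives the starting index of each subarray (0, S, 2S, ...). Adding [:, None] turns these starts into a column of shape (nrows, 1), and np.arange(L) supplies the offsets 0 to L - 1 within a subarray. Broadcasting the column against the offsets yields an (nrows, L) matrix of indices, and indexing a with that matrix returns the subarrays as a new array (a copy of the data).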
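To make the index arithmetic concrete, here is a small sketch that builds the index matrix step by step for an 11-element array with L=5 and S=3. The variable names starts, offsets, idx, and windows are illustrative only.

```python
import numpy as np

L, S = 5, 3
a = np.arange(1, 12)                    # 11 elements: 1..11
nrows = ((a.size - L) // S) + 1         # (11 - 5) // 3 + 1 = 3

starts = S * np.arange(nrows)[:, None]  # shape (3, 1): [[0], [3], [6]]
offsets = np.arange(L)                  # shape (5,):   [0, 1, 2, 3, 4]
idx = starts + offsets                  # broadcast to shape (3, 5)

print(idx)
# [[ 0  1  2  3  4]
#  [ 3  4  5  6  7]
#  [ 6  7  8  9 10]]

windows = a[idx]  # integer-array indexing, so this is a copy of the data
```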
Approach 2: NumPy Strides
Another efficient method uses NumPy's strides feature. Strides represent the number of bytes between consecutive elements along each axis. We can use this information to create subarrays:
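import numpy as np

def strided_app(a, L, S):
    # Number of complete length-L windows when starts are S elements apart
    nrows = ((a.size - L) // S) + 1
    n = a.strides[0]  # bytes between consecutive elements of the 1-D array
    return np.lib.stride_tricks.as_strided(a, shape=(nrows, L), strides=(S * n, n))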
Explanation:
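np.lib.stride_tricks.as_strided builds a view of a with shape (nrows, L) without copying any data. The strides argument is given in bytes: moving down one row advances S * n bytes (S elements), and moving right one column advances n bytes (one element). Because the result is a view whose rows may overlap in memory, writing to it can change several subarrays, and a itself, at once. as_strided also does not check that the requested shape and strides stay inside the array's buffer, so nrows must be computed correctly.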
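The sketch below illustrates the view behaviour on the 11-element example. It passes writeable=False, an existing as_strided parameter, as one way to guard against accidental writes; the variable names are illustrative.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

a = np.arange(1, 12)
L, S = 5, 3
nrows = ((a.size - L) // S) + 1
n = a.strides[0]

# writeable=False guards against accidental writes through overlapping windows
view = as_strided(a, shape=(nrows, L), strides=(S * n, n), writeable=False)

shares = np.shares_memory(view, a)  # True: no data was copied
row_step, col_step = view.strides   # (S * n, n), measured in bytes

# Take an explicit copy if the windows need to be modified independently
windows = view.copy()
windows[0, 0] = -1                  # leaves `a` untouched
```

Copying gives up the memory saving, so it is only worth doing when the windows really have to be edited.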
Sample Code:
To illustrate the approaches:
a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
print(broadcasting_app(a, L=5, S=3))
print(strided_app(a, L=5, S=3))
Output:
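[[ 1  2  3  4  5]
 [ 4  5  6  7  8]
 [ 7  8  9 10 11]]
[[ 1  2  3  4  5]
 [ 4  5  6  7  8]
 [ 7  8  9 10 11]]
Both approaches produce the same matrix of subarrays without a Python-level loop. The broadcasting version returns a new array (a copy), while the strided version returns a view on the original data.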
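As an aside to the two approaches above, newer NumPy versions also provide sliding_window_view, which performs the bounds checking itself. The sketch below assumes NumPy 1.20 or later and is offered as a possible alternative rather than a replacement for either function.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

a = np.arange(1, 12)
L, S = 5, 3

# All length-L windows at every offset, then keep every S-th one.
# Assumes NumPy >= 1.20, where sliding_window_view was added.
windows = sliding_window_view(a, L)[::S]
```

The result is a read-only view by default, so it shares memory with a in the same way the as_strided version does.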