
A detailed explanation of Python data analysis module Numpy slicing, indexing and broadcasting

WBOY (forwarded)
2023-04-10 14:56:32

Numpy Slicing and Indexing

The contents of an ndarray object can be accessed and modified through indexing or slicing, just like the slicing operation for lists in Python.

An ndarray can be indexed with subscripts from 0 to n-1. A sub-array can be cut out of the original array either with a slice object, created by the built-in slice function from start, stop and step parameters, or with the equivalent start:stop:step colon notation.
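
As a minimal sketch of the two equivalent ways to slice (the array `a` and the variable names are illustrative assumptions, not taken from the article), the following compares a slice object with the colon notation:

```python
import numpy as np

# Illustrative array holding the values 0..9
a = np.arange(10)

# Build a slice object with the built-in slice(start, stop, step)
s = slice(2, 7, 2)
b = a[s]

# The same selection written with the start:stop:step colon notation
c = a[2:7:2]

print(b)  # [2 4 6]
print(c)  # [2 4 6]
```

Both forms select the elements at indices 2, 4 and 6; the stop index 7 is excluded.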


Slices can also include an ellipsis (...) to make the length of the selection tuple match the number of dimensions of the array. If an ellipsis is used in the row position, the result is an ndarray containing the selected elements from every row.
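
A short sketch of how the ellipsis might be used, assuming a small illustrative 3x3 array `a` and a 3-dimensional array `cube` (neither comes from the article):

```python
import numpy as np

# Illustrative 3x3 array
a = np.array([[1, 2, 3], [3, 4, 5], [4, 5, 6]])

second_col = a[..., 1]   # every row, column 1
second_row = a[1, ...]   # row 1, every column
tail_cols = a[..., 1:]   # every row, columns 1 onward

print(second_col)  # [2 4 5]
print(second_row)  # [3 4 5]
print(tail_cols)

# In 3 dimensions the ellipsis stands in for as many ':' as are needed
cube = np.arange(24).reshape(2, 3, 4)
print(np.array_equal(cube[..., 0], cube[:, :, 0]))  # True
```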


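Advanced indexing

Integer array indexing

Integer array indexing pairs a list of row indices with a list of column indices. For instance, the row indices [0, 1, 2] and the column indices [0, 1, 0] select the elements of an array at positions (0,0), (1,1) and (2,0).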
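
A minimal sketch of selecting the elements at positions (0,0), (1,1) and (2,0), assuming an illustrative 3x2 array `x` that is not part of the article:

```python
import numpy as np

x = np.array([[1, 2], [3, 4], [5, 6]])

# Row indices and column indices are paired element by element:
# (0, 0), (1, 1) and (2, 0)
y = x[[0, 1, 2], [0, 1, 0]]

print(y)  # [1 4 5]
```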


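The next example indexes a 4x3 array with two-dimensional row and column index arrays:

import numpy as np

a = np.array([[0,1,2], [3,4,5], [6,7,8], [9,10,11]])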
print(a)
print('-' * 20)

rows = np.array([[0,0], [3,3]])
cols = np.array([[0,2], [0,2]])

b = a[rows, cols]
print(b)
print('-' * 20)

rows = np.array([[0,1], [2,3]])
cols = np.array([[0,2], [0,2]])
c = a[rows, cols]
print(c)
print('-' * 20)

rows = np.array([[0,1,2], [1,2,3], [1,2,3]])
cols = np.array([[0,1,2], [0,1,2], [0,1,2]])
d = a[rows, cols]
print(d)
[[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]]
--------------------
[[ 0  2]
 [ 9 11]]
--------------------
[[ 0  5]
 [ 6 11]]
--------------------
[[ 0  4  8]
 [ 3  7 11]
 [ 3  7 11]]

In the first result, b, the index arrays pick out the four corner elements of a. In every case the returned ndarray has the same shape as the index arrays.

Index arrays can be combined with slices (:) or an ellipsis (...), as in the following example:

a = np.array([[1,2,3], [4,5,6], [7,8,9]])

print(a)
print('-' * 20)

b = a[1:3, 1:3]
print(b)
print('-' * 20)

c = a[1:3, [0,2]]
print(c)
print('-' * 20)

d = a[..., 1:]
print(d)
[[1 2 3]
 [4 5 6]
 [7 8 9]]
--------------------
[[5 6]
 [8 9]]
--------------------
[[4 6]
 [7 9]]
--------------------
[[2 3]
 [5 6]
 [8 9]]

Boolean indexing

We can index the target array through a Boolean array.

Boolean indexing uses Boolean operations (such as comparison operators) to obtain an array of the elements that meet the specified conditions.

The following example obtains elements greater than 5:

a = np.array([[1,2,3], [4,5,6], [7,8,9]])

print(a)
print('-' * 20)

print(a[a > 5])
[[1 2 3]
 [4 5 6]
 [7 8 9]]
--------------------
[6 7 8 9]
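
To make the mechanism more concrete, here is a hedged sketch that inspects the Boolean mask itself, combines two conditions, and assigns through a mask. The names `mask`, `evens_over_5` and `b` are illustrative assumptions:

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# The comparison produces a Boolean array with the same shape as a
mask = a > 5
print(mask)

# Conditions are combined with & (and) and | (or); the parentheses are required
evens_over_5 = a[(a > 5) & (a % 2 == 0)]
print(evens_over_5)  # [6 8]

# A mask can also appear on the left-hand side of an assignment
b = a.copy()
b[b > 5] = 0
print(b)
```

Indexing with a mask always returns a one-dimensional array of the matching elements, while assignment through a mask modifies the array in place.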

The following example uses ~ (the complement operator) to filter out NaN values.

a = np.array([np.nan, 1, 2, np.nan, 3, 4, 5])

print(a)
print('-' * 20)

print(a[~np.isnan(a)])
[nan  1.  2. nan  3.  4.  5.]
--------------------
[1. 2. 3. 4. 5.]

The following example demonstrates how to filter out the non-complex elements of an array, keeping only the complex numbers.

a = np.array([1, 3+4j, 5, 6+7j])

print(a)
print('-' * 20)

print(a[np.iscomplex(a)])
[1.+0.j 3.+4.j 5.+0.j 6.+7.j]
--------------------
[3.+4.j 6.+7.j]

Fancy indexing

Fancy indexing refers to using an integer array for indexing.

Fancy indexing uses the values of the index array as subscripts along an axis of the target array.

When a one-dimensional integer array is used as the index and the target is a one-dimensional array, the result is the elements at the corresponding positions. If the target is a two-dimensional array, the result is the rows corresponding to the subscripts.

Unlike slicing, fancy indexing always copies the data into a new array.
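
The copy-versus-view difference can be checked with np.shares_memory. This is a minimal sketch; the names `view` and `fancy` are illustrative assumptions:

```python
import numpy as np

a = np.arange(2, 10)      # values 2..9

view = a[0:3]             # basic slice: shares memory with a
fancy = a[[0, 1, 2]]      # fancy index: independent copy

print(np.shares_memory(a, view))   # True
print(np.shares_memory(a, fancy))  # False

view[0] = -1              # visible through a
fancy[1] = -2             # not visible through a
print(a[0], a[1])         # a[0] is now -1, a[1] is still 3
```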

One-dimensional array

a = np.arange(2, 10)

print(a)
print('-' * 20)

b = a[[0,6]]
print(b)
[2 3 4 5 6 7 8 9]
--------------------
[2 8]

Two-dimensional array

1. Pass in an index array of positive (front-to-back) indices

a = np.arange(32).reshape(8, 4)

print(a)
print('-' * 20)

print(a[[4, 2, 1, 7]])
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]
 [16 17 18 19]
 [20 21 22 23]
 [24 25 26 27]
 [28 29 30 31]]
--------------------
[[16 17 18 19]
 [ 8  9 10 11]
 [ 4  5  6  7]
 [28 29 30 31]]

2. Pass in an index array of negative indices (counted from the end)

a = np.arange(32).reshape(8, 4)
print(a[[-4, -2, -1, -7]])
[[16 17 18 19]
 [24 25 26 27]
 [28 29 30 31]
 [ 4  5  6  7]]

3. Pass in multiple index arrays (use np.ix_)

The np.ix_ function takes two arrays and builds an index grid that pairs every element of the first with every element of the second, that is, their Cartesian product.

In mathematics, the Cartesian product (also called the direct product) of two sets X and Y, written X×Y, is the set of all possible ordered pairs whose first element is a member of X and whose second element is a member of Y. For example, if A={a,b} and B={0,1,2}, then:

A×B={(a, 0), (a, 1), (a, 2), (b, 0), (b, 1), (b, 2)}
B×A={(0, a), (0, b), (1, a), (1, b), (2, a), (2, b)}

a = np.arange(32).reshape(8, 4)
print(a[np.ix_([1,5,7,2], [0,3,1,2])])
[[ 4  7  5  6]
 [20 23 21 22]
 [28 31 29 30]
 [ 8 11  9 10]]
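
To show what np.ix_ adds, the sketch below compares it with passing the two lists directly. The names `rows`, `cols`, `grid` and `paired` are illustrative assumptions:

```python
import numpy as np

a = np.arange(32).reshape(8, 4)
rows = [1, 5, 7, 2]
cols = [0, 3, 1, 2]

# np.ix_ reshapes the two lists so that they broadcast against each other
r, c = np.ix_(rows, cols)
print(r.shape, c.shape)   # (4, 1) (1, 4)

grid = a[r, c]            # every row in rows crossed with every column in cols
paired = a[rows, cols]    # without np.ix_: only (1,0), (5,3), (7,1), (2,2)

print(grid.shape)         # (4, 4)
print(paired)             # [ 4 23 29 10]

# The grid is the same as selecting the rows first and then the columns
print(np.array_equal(grid, a[rows][:, cols]))  # True
```

Without np.ix_ the two lists are paired element by element, which yields four values instead of the 4x4 grid.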

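Broadcasting

Broadcasting is numpy's way of performing numerical calculations on arrays of different shapes. Arithmetic operations on arrays are usually performed on corresponding elements.

If two arrays a and b have the same shape, that is, a.shape == b.shape, then the result of a*b is the element-wise product of a and b. This requires the arrays to have the same number of dimensions and the same length along each dimension.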

a = np.arange(1, 5)
b = np.arange(1, 5)

c = a * b
print(c)
[ 1  4  9 16]

When the two arrays in an operation have different shapes, numpy automatically triggers the broadcasting mechanism. For example:

a = np.array([
[0, 0, 0],
[10, 10, 10],
[20, 20, 20],
[30, 30, 30]
])

b = np.array([0, 1, 2])

print(a + b)
[[ 0  1  2]
 [10 11 12]
 [20 21 22]
 [30 31 32]]

The following figure shows how array b is broadcast so that it is compatible with array a.

[Figure: array b stretched across the four rows of array a]
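
The stretching of b can also be inspected in code with np.broadcast_to. This is a sketch rather than a description of what numpy does internally; the name `stretched` is an illustrative assumption:

```python
import numpy as np

a = np.array([[0, 0, 0], [10, 10, 10], [20, 20, 20], [30, 30, 30]])
b = np.array([0, 1, 2])

# Read-only view of b stretched to a's shape (4, 3); no data is copied
stretched = np.broadcast_to(b, a.shape)
print(stretched)

# Adding the stretched view gives the same result as adding b directly
print(np.array_equal(a + b, a + stretched))  # True
```

The stretched view has a stride of 0 along the first axis, so all four rows refer to the same three values in memory.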

Expanding an array with tile

a = np.array([1, 2])

b = np.tile(a, (6, 1))
print(b)

print('-' * 20)

c = np.tile(a, (2, 3))
print(c)
[[1 2]
 [1 2]
 [1 2]
 [1 2]
 [1 2]
 [1 2]]
--------------------
[[1 2 1 2 1 2]
 [1 2 1 2 1 2]]

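Adding a 4x3 two-dimensional array to a one-dimensional array of length 3 is equivalent to repeating array b 4 times to form a two-dimensional array and then performing the operation: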

a = np.array([
[0, 0, 0],
[10, 10, 10],
[20, 20, 20],
[30, 30, 30]
])

b = np.array([0, 1, 2])
bb = np.tile(b, (4, 1))

print(a + bb)
[[ 0  1  2]
 [10 11 12]
 [20 21 22]
 [30 31 32]]

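The rules of broadcasting:

  • All input arrays are aligned with the array that has the longest shape; shapes that are too short are padded by prepending dimensions of length 1.
  • The shape of the output array is the maximum along each dimension of the input array shapes.
  • An input array can be used in the calculation if, along each dimension, its length either equals the length of the output array's corresponding dimension or is 1; otherwise an error is raised.
  • When an input array has length 1 along a dimension, the first set of values along that dimension is used for every operation along it.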
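
The rules above can be sketched as a small function that works on plain shape tuples. `broadcast_shape` is a hypothetical helper written for illustration, not a numpy function; the result is cross-checked against np.broadcast:

```python
import numpy as np

def broadcast_shape(*shapes):
    """Hypothetical helper: apply the broadcasting rules to shape tuples."""
    ndim = max(len(s) for s in shapes)
    # Rule 1: left-pad the shorter shapes with dimensions of length 1
    padded = [(1,) * (ndim - len(s)) + tuple(s) for s in shapes]
    result = []
    for dims in zip(*padded):
        # Rule 3: lengths other than 1 must all agree along this dimension
        sizes = {d for d in dims if d != 1}
        if len(sizes) > 1:
            raise ValueError(f"shapes {shapes} cannot be broadcast together")
        # Rule 2: the common length (or 1 if every length is 1) is the output length
        result.append(sizes.pop() if sizes else 1)
    return tuple(result)

print(broadcast_shape((4, 3), (3,)))       # (4, 3)
print(broadcast_shape((8, 1, 6), (7, 1)))  # (8, 7, 6)

# Cross-check against numpy itself
x = np.empty((8, 1, 6))
y = np.empty((7, 1))
print(np.broadcast(x, y).shape)            # (8, 7, 6)
```

The fourth rule concerns which values are read during the computation rather than the resulting shape, so it does not appear in the helper.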

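Put simply: for two arrays, compare each of their dimensions in turn (ignoring a dimension if one of the arrays does not have it). The arrays can be broadcast if one of the following holds:

  • The arrays have the same shape.
  • The lengths of the current dimension are equal.
  • One of the lengths of the current dimension is 1.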

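If the conditions are not met, a ValueError is raised, for example "ValueError: frames are not aligned" (recent NumPy versions word the message as "operands could not be broadcast together with shapes ...").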
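
A hedged sketch of the failure case and one possible fix. The arrays are illustrative assumptions, and the exact error text depends on the installed NumPy version:

```python
import numpy as np

a = np.zeros((4, 3))
b = np.array([0, 10, 20, 30])   # shape (4,): trailing lengths 3 and 4 differ

try:
    a + b
except ValueError as exc:
    message = str(exc)
    print("ValueError:", message)

# Giving b a trailing dimension of length 1 makes the shapes
# (4, 3) and (4, 1) compatible
c = a + b[:, np.newaxis]
print(c.shape)                  # (4, 3)
```

Here each element of b is added to a whole row of a, because b[:, np.newaxis] has length 1 along the second dimension.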


Statement:
This article is reproduced from 51cto.com. If there is any infringement, please contact admin@php.cn to have it removed.