


This article introduces example code for working with sparse matrices in Python, shared here as a reference.
In engineering practice, large matrices are usually sparse, so handling sparse matrices well matters a great deal. This article uses Python as the example and starts by looking at how sparse matrices are stored and represented.
1. A first look at the sparse module
Python's scipy package contains a module called sparse, designed specifically for working with sparse matrices. Most of this article is based on that module.
The first step is to import the sparse module:
>>> from scipy import sparse
Then call help on it for a first look:
>>> help(sparse)
Go straight to the part we care about most:
Usage information
=================

There are seven available sparse matrix types:

    1. csc_matrix: Compressed Sparse Column format
    2. csr_matrix: Compressed Sparse Row format
    3. bsr_matrix: Block Sparse Row format
    4. lil_matrix: List of Lists format
    5. dok_matrix: Dictionary of Keys format
    6. coo_matrix: COOrdinate format (aka IJV, triplet format)
    7. dia_matrix: DIAgonal format

To construct a matrix efficiently, use either dok_matrix or lil_matrix.
The lil_matrix class supports basic slicing and fancy indexing with a
similar syntax to NumPy arrays. As illustrated below, the COO format
may also be used to efficiently construct matrices.

To perform manipulations such as multiplication or inversion, first
convert the matrix to either CSC or CSR format. The lil_matrix format is
row-based, so conversion to CSR is efficient, whereas conversion to CSC
is less so.

All conversions among the CSR, CSC, and COO formats are efficient,
linear-time operations.
This description gives a general picture of the sparse module: it offers seven ways to store a sparse matrix. The following sections go through them.
2. coo_matrix
coo_matrix is the simplest storage format. It uses three arrays, row, col and data, to store the non-zero elements. The three arrays have the same length: row holds each element's row index, col holds its column index, and data holds its value. coo_matrix is mainly used to create matrices, because it does not support adding, deleting, or modifying individual elements. Once the matrix has been created, it is usually converted to another format.
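>>> row = [2, 2, 3, 2]
>>> col = [3, 4, 2, 3]
>>> data = [1, 2, 3, 4]  # not shown in the original listing; values chosen to match the output below
>>> c = sparse.coo_matrix((data, (row, col)), shape=(5, 6))
>>> print(c.toarray())
[[0 0 0 0 0 0]
 [0 0 0 0 0 0]
 [0 0 0 5 2 0]
 [0 0 3 0 0 0]
 [0 0 0 0 0 0]]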
Note that when creating a matrix with coo_matrix, the same (row, column) coordinate may appear more than once. When the matrix is converted, for example to a dense array or to CSR, the values at that coordinate are added together to give the final result. In the example above, coordinate (2, 3) appears twice, and the entry shown at that position is the sum.
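The summing of duplicate coordinates can be checked directly. The snippet below is a minimal sketch with made-up values (the arrays are illustrative, not taken from the article): the COO matrix still stores both entries for coordinate (0, 0), while the dense array and the CSR conversion hold their sum.

```python
from scipy import sparse

# Coordinate (0, 0) appears twice on purpose (illustrative values).
row = [0, 0, 1]
col = [0, 0, 2]
data = [1, 4, 7]

c = sparse.coo_matrix((data, (row, col)), shape=(2, 3))

stored = c.nnz           # COO keeps the duplicate entries separately
dense = c.toarray()      # duplicates are summed when building the dense array
csr = c.tocsr()          # duplicates are also summed on conversion to CSR
merged = csr.nnz

print(stored, merged)
print(dense)
```

Converting to CSR or CSC is therefore a convenient way to collapse duplicates before doing arithmetic.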
3. dok_matrix and lil_matrix
dok_matrix and lil_matrix are suited to building a matrix by adding elements step by step. dok_matrix uses a dictionary to record the non-zero elements of the matrix: each key is a tuple holding the position of an element, and the value is the element itself.
>>> import numpy as np
>>> from scipy.sparse import dok_matrix
>>> S = dok_matrix((5, 5), dtype=np.float32)
>>> for i in range(5):
...     for j in range(5):
...         S[i, j] = i + j
...
>>> print(S.toarray())
[[0. 1. 2. 3. 4.]
 [1. 2. 3. 4. 5.]
 [2. 3. 4. 5. 6.]
 [3. 4. 5. 6. 7.]
 [4. 5. 6. 7. 8.]]
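The dictionary-of-keys idea itself can be sketched with a plain Python dict. This is only an illustration of the storage strategy described above, not scipy's implementation; the helper name dok_to_dense is hypothetical.

```python
def dok_to_dense(entries, shape):
    """Expand a {(row, col): value} dictionary into a dense list of lists."""
    n_rows, n_cols = shape
    dense = [[0] * n_cols for _ in range(n_rows)]
    for (i, j), value in entries.items():
        dense[i][j] = value
    return dense


# Only the non-zero elements are recorded, keyed by their position tuple.
entries = {}
entries[(0, 1)] = 5
entries[(2, 0)] = 7

dense = dok_to_dense(entries, (3, 2))
print(dense)
```

Because a dictionary supports cheap insertion and lookup by key, this layout is convenient for incremental construction but poorly suited to arithmetic.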
lil_matrix uses two lists of lists to store the non-zero elements: data holds the non-zero values of each row, and rows holds the column indices of those values. This format is also well suited to adding elements one at a time and to retrieving the data of a row quickly.
>>> from scipy.sparse import lil_matrix
>>> l = lil_matrix((6, 5))
>>> l[2, 3] = 1
>>> l[3, 4] = 2
>>> l[3, 2] = 3
>>> print(l.toarray())
[[0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 3. 0. 2.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]]
>>> # the exact printed form of l.data and l.rows depends on the NumPy version
>>> print(l.data)
[[] [] [1.0] [3.0, 2.0] [] []]
>>> print(l.rows)
[[] [] [3] [2, 4] [] []]
As the above shows, these two formats are generally used to build a matrix by adding non-zero elements step by step, after which the matrix is converted to a storage format that supports fast computation.
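A minimal sketch of that workflow, with illustrative values: build the matrix incrementally as a lil_matrix, convert it to CSR, and only then do the arithmetic.

```python
import numpy as np
from scipy.sparse import lil_matrix

# Build step by step in LIL format (values are illustrative).
l = lil_matrix((3, 3))
l[0, 0] = 1
l[1, 2] = 2
l[2, 1] = 3

# Convert once construction is finished; LIL is row-based, so CSR is the natural target.
csr = l.tocsr()

# Matrix-vector product on the CSR matrix.
v = np.array([1.0, 2.0, 3.0])
product = csr @ v
print(product)
```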
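4. dia_matrix
This is a diagonal storage format. The diagonals of the matrix are kept in a two-dimensional storage array together with their offsets from the main diagonal; in the usual illustration of the format, each column of the storage array represents a diagonal and each row corresponds to a matrix row. Diagonals whose elements are all 0 are omitted.
If the original matrix is a diagonal matrix, the compression ratio is very high.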
A picture found online makes the principle easy to understand (the image is not available in this copy).
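In place of the picture, here is a small sketch using scipy's dia_matrix with illustrative values. Note that scipy lays the diagonals out as rows of the data array (one row per offset), and an entry data[k, j] ends up in column j of the diagonal with offset offsets[k]; entries that fall outside the matrix are ignored.

```python
import numpy as np
from scipy.sparse import dia_matrix

# One row of `data` per stored diagonal (illustrative values).
data = np.array([[1, 2, 3, 4],    # main diagonal, offset 0
                 [5, 6, 7, 8]])   # first superdiagonal, offset 1
offsets = np.array([0, 1])

d = dia_matrix((data, offsets), shape=(4, 4))
dense = d.toarray()
print(dense)
# [[1 6 0 0]
#  [0 2 7 0]
#  [0 0 3 8]
#  [0 0 0 4]]
```

The value 5 does not appear in the result: for offset 1 it would belong to column 0, where that diagonal has no element.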
5. csr_matrix and csc_matrix
csr_matrix, short for Compressed Sparse Row, compresses the matrix row by row. CSR needs three arrays: the values, the column indices, and the row offsets. The values and column indices mean the same as in COO. The row offsets give, for each row, the position in the values array where that row's first element starts.
A picture found online also illustrates the principle well (the image is not available in this copy).
Let's see how to use it in Python:
>>> import numpy as np
>>> from scipy.sparse import csr_matrix
>>> indptr = np.array([0, 2, 3, 6])
>>> indices = np.array([0, 2, 2, 0, 1, 2])
>>> data = np.array([1, 2, 3, 4, 5, 6])
>>> csr_matrix((data, indices, indptr), shape=(3, 3)).toarray()
array([[1, 0, 2],
       [0, 0, 3],
       [4, 5, 6]])
That is not hard to follow, is it?
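To make the role of the row offsets concrete, the sketch below decodes the same three arrays by hand: row i owns the slice indptr[i]:indptr[i + 1] of both indices and data. The loop is only an illustration of the layout, not how scipy works internally.

```python
import numpy as np

indptr = np.array([0, 2, 3, 6])
indices = np.array([0, 2, 2, 0, 1, 2])
data = np.array([1, 2, 3, 4, 5, 6])

rows = []
for i in range(len(indptr) - 1):
    start, end = indptr[i], indptr[i + 1]
    # Map column index -> value for the non-zero elements of row i.
    rows.append({int(j): int(v) for j, v in zip(indices[start:end], data[start:end])})

for i, entries in enumerate(rows):
    print(i, entries)
```

Row 1, for example, owns only position 2 of the arrays, so it has a single non-zero element: the value 3 in column 2.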
Let's look at what the documentation says:
Notes
-----

Sparse matrices can be used in arithmetic operations: they support
addition, subtraction, multiplication, division, and matrix power.

Advantages of the CSR format
  - efficient arithmetic operations CSR + CSR, CSR * CSR, etc.
  - efficient row slicing
  - fast matrix vector products

Disadvantages of the CSR format
  - slow column slicing operations (consider CSC)
  - changes to the sparsity structure are expensive (consider LIL or DOK)
Clearly, csr_matrix is well suited to actual matrix computation.
csc_matrix is similar to csr_matrix, except that it is compressed by column, so it is not covered separately.
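As a brief illustration of the column-based counterpart, the sketch below converts the CSR matrix from the earlier example to CSC and inspects its arrays. In CSC, indptr holds column offsets and indices holds row indices.

```python
import numpy as np
from scipy.sparse import csr_matrix

indptr = np.array([0, 2, 3, 6])
indices = np.array([0, 2, 2, 0, 1, 2])
data = np.array([1, 2, 3, 4, 5, 6])
csr = csr_matrix((data, indices, indptr), shape=(3, 3))

csc = csr.tocsc()

# Column 0 holds 1 and 4 (rows 0 and 2), column 1 holds 5 (row 2),
# and column 2 holds 2, 3 and 6 (rows 0, 1 and 2).
print(csc.indptr)
print(csc.indices)
print(csc.data)
```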
6. bsr_matrix
Block Sparse Row format, as the name suggests, compresses the matrix in blocks.
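A small sketch, following the pattern of the example in the scipy documentation: indptr and indices work as in CSR, but they address 2x2 blocks instead of single elements, and data holds one dense block per stored position.

```python
import numpy as np
from scipy.sparse import bsr_matrix

indptr = np.array([0, 2, 3, 6])          # block-row offsets
indices = np.array([0, 2, 2, 0, 1, 2])   # block-column indices
# Six 2x2 blocks, each filled with a single repeated value.
data = np.array([1, 2, 3, 4, 5, 6]).repeat(4).reshape(6, 2, 2)

b = bsr_matrix((data, indices, indptr), shape=(6, 6))
dense = b.toarray()
print(dense)
# [[1 1 0 0 2 2]
#  [1 1 0 0 2 2]
#  [0 0 0 0 3 3]
#  [0 0 0 0 3 3]
#  [4 4 5 5 6 6]
#  [4 4 5 5 6 6]]
```

The block layout is the same as the 3x3 CSR example, with every element replaced by a 2x2 block.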