
A brief analysis of Python data processing

Author: 不言 (original article)
2018-05-02 13:46:24

This article summarizes the key points of Python data processing. Readers interested in the topic can use it as a reference.

Numpy and Pandas are two libraries often used in Python data processing. Numpy's core is written in C, and Pandas is built on top of Numpy with its performance-critical parts compiled, so operations are fast. Matplotlib is a Python plotting tool that can visualize the processed data. I had only looked at the syntax before and had not studied or summarized it systematically. This blog post summarizes the APIs of these three libraries.

The following briefly introduces the three libraries and how they differ:

  • Numpy: often used for generating array data and performing numerical operations on it

  • Pandas: built on Numpy; it adds labeled data structures (Series and DataFrame) on top of Numpy arrays

  • Matplotlib: A powerful drawing tool in Python

Numpy

For a quick start with Numpy, refer to the Numpy tutorial.

Numpy properties

ndarray.ndim: Number of dimensions (axes)

ndarray.shape: Size of the array along each dimension; for a 2-D array this is (rows, columns), such as (3, 5)

ndarray.size: Total number of elements

ndarray.dtype: Element type
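The four properties above can be checked on a small array. This is a minimal sketch; the 3x5 array `a` is a hypothetical example chosen to match the (3, 5) shape mentioned above, not data from the original article.

```python
import numpy as np

# Hypothetical 3x5 array holding the integers 0..14, used only for illustration.
a = np.arange(15).reshape(3, 5)

print(a.ndim)   # number of dimensions (axes)
print(a.shape)  # size along each axis, here (3, 5)
print(a.size)   # total number of elements, here 3 * 5
print(a.dtype)  # element type; the exact integer type depends on the platform
```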

Numpy creation

array(object, dtype=None): Create an array from a Python list or tuple

zeros(shape, dtype=float): Create an array filled with 0

ones(shape, dtype=None): Create an array filled with 1

empty(shape, dtype=float): Create an array without initializing its contents

arange([start, ]stop, [step, ]dtype=None): Create values spaced by a fixed step (stop is excluded)

linspace(start, stop, num=50, dtype=None): Create num evenly spaced values between start and stop
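As a rough illustration of the creation functions listed above, the sketch below calls each one once. All variable names and argument values are assumptions made for the example.

```python
import numpy as np

# Build an array from a nested Python list, forcing a float element type.
from_list = np.array([[1, 2, 3], [4, 5, 6]], dtype=float)

zeros = np.zeros((2, 3))            # 2x3 array of 0.0
ones = np.ones((2, 3), dtype=int)   # 2x3 array of integer 1
uninit = np.empty((2, 3))           # 2x3 array; contents are arbitrary until assigned

steps = np.arange(10, 20, 2)        # 10, 12, 14, 16, 18 (stop value 20 is excluded)
even = np.linspace(0, 1, num=5)     # 5 evenly spaced values from 0 to 1 inclusive

print(steps)
print(even)
```

Note that `np.empty` only allocates memory, so its values should not be read before they are written.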

Numpy operation

Add, Subtract: a + b, a - b

Multiply: b*2, 10*np.sin(a)

Power: b**2

Comparison: a [text missing in the source] 0]
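The element-wise operations listed above can be sketched as follows. The arrays `a` and `b` and the comparison threshold 3 are assumptions for illustration; the article's own comparison example did not survive intact.

```python
import numpy as np

a = np.array([10, 20, 30, 40])
b = np.arange(4)            # 0, 1, 2, 3

total = a + b               # element-wise addition
diff = a - b                # element-wise subtraction
doubled = b * 2             # multiply every element by a scalar
wave = 10 * np.sin(a)       # apply a function element-wise, then scale
squared = b ** 2            # element-wise power

# Assumed example of a comparison: it yields a boolean array, one value per element.
mask = b < 3

print(total, diff, doubled, squared, mask)
```

All of these operate element by element and return a new array; the inputs are left unchanged.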

Pandas missing data

Delete rows with missing data: df.dropna(how='any')

Fill in missing data: df.fillna(value=5)

Check whether each value is NaN: pd.isna(df1)
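A minimal sketch of the three missing-data calls above, run on a small hypothetical DataFrame (the column names and values are assumptions, not data from the article):

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one missing value in each column.
df = pd.DataFrame({"A": [1.0, np.nan, 3.0], "B": [4.0, 5.0, np.nan]})

dropped = df.dropna(how="any")   # keep only rows with no missing values
filled = df.fillna(value=5)      # replace every missing value with 5
missing = pd.isna(df)            # boolean frame: True where a value is missing

print(dropped)
print(filled)
print(missing)
```

Both `dropna` and `fillna` return new objects by default, so `df` itself still contains the NaN values afterwards.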

Pandas merging data

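pd.concat([df1, df2, df3], axis=0): Concatenate DataFrames along rows

pd.merge(left, right, on='key'): Merge two DataFrames on the key column

df.append(s, ignore_index=True): Append a row (deprecated in pandas 1.4 and removed in pandas 2.0; use pd.concat instead)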
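The merging calls above might look like this in practice. The frames and column names are hypothetical, and because `df.append` no longer exists in pandas 2.0 and later, the row is added with `pd.concat` instead, which is a substitution and not the call listed in the article.

```python
import pandas as pd

df1 = pd.DataFrame({"key": ["a", "b"], "x": [1, 2]})
df2 = pd.DataFrame({"key": ["c", "d"], "x": [3, 4]})

# Stack the frames vertically and renumber the index 0..3.
stacked = pd.concat([df1, df2], axis=0, ignore_index=True)

left = pd.DataFrame({"key": ["a", "b"], "lval": [1, 2]})
right = pd.DataFrame({"key": ["a", "b"], "rval": [3, 4]})

# Join rows from both frames that share the same value in "key".
merged = pd.merge(left, right, on="key")

# Add one row; pd.concat stands in for the removed df.append.
new_row = pd.DataFrame([{"key": "e", "x": 5}])
extended = pd.concat([stacked, new_row], ignore_index=True)

print(stacked)
print(merged)
print(extended)
```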

Pandas import and export

df.to_csv('foo.csv'): Save to a CSV file

pd.read_csv('foo.csv'): Read from a CSV file

df.to_excel('foo.xlsx', sheet_name='Sheet1'): Save to an Excel file

pd.read_excel('foo.xlsx', 'Sheet1', index_col=None, na_values=['NA']): Read from an Excel file
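A small round trip through CSV shows the save and load calls together. This sketch assumes a hypothetical two-column frame and writes into a temporary directory so that it leaves no files behind. The Excel calls follow the same pattern but need an extra engine package such as openpyxl, so they are left out here.

```python
import os
import tempfile

import pandas as pd

# Hypothetical data used only to demonstrate the round trip.
df = pd.DataFrame({"name": ["x", "y"], "value": [1, 2]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "foo.csv")
    df.to_csv(path, index=False)   # index=False avoids writing the row index as a column
    loaded = pd.read_csv(path)     # read the file back into a new DataFrame

print(loaded)
```

Without `index=False`, the saved file gains an unnamed index column, which then reappears as an extra column when the file is read back.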

Matplotlib

Here we only introduce the simplest way to plot:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

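# Generate 1000 random values
data = pd.Series(np.random.randn(1000), index=np.arange(1000))
# Accumulate the values so the effect is easier to see;
# cumsum() returns a new Series, so the result must be assigned back
data = data.cumsum()
# A pandas Series can be plotted directly
data.plot()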
plt.show()
