
Detailed explanation of how Python uses Pandas for data analysis

WBOY
2022-09-06 17:30:01


Pandas is one of the most popular Python libraries for data analysis. It delivers highly optimized performance, with its backend source code written in C or Python.

Pandas provides two main data structures for analyzing data:

  • 1. Series

  • 2. DataFrame

Series

A Series is a one-dimensional (1-D) labeled array defined in pandas that can store any data type.

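Code #1

Create a Series

# Program to create a Series

# Import the pandas library
import pandas as pd

# Create a Series from data and an index
# (Data and Index are placeholders; concrete values follow in the later examples)
a = pd.Series(Data, index = Index)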

Here, Data can be:

  • Scalar values, such as integers or strings
  • A Python dictionary of key-value pairs
  • An ndarray

Note: By default, the index runs 0, 1, 2, ..., (n-1), where n is the length of the data.
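The default index can be checked with a minimal sketch. The sample values and variable names below are illustrative assumptions and do not come from the original article.

```python
import pandas as pd

# Illustrative values (assumed for this sketch)
values = [10, 20, 30]

# No index given: pandas assigns the default labels 0, 1, ..., n-1
default_indexed = pd.Series(values)
default_labels = list(default_indexed.index)

# A single scalar is repeated once for every label in the supplied index
repeated = pd.Series(5, index=['a', 'b', 'c'])
repeated_values = repeated.tolist()

print(default_labels)
print(repeated_values)
```

When a single scalar is passed, an index is needed so that pandas knows how many times to repeat the value.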

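Code #2

When Data contains scalar values

# Program to create a Series from scalar values

# Import the pandas library
import pandas as pd

# Numeric data
Data = [1, 3, 4, 5, 6, 2, 9]

# Create a Series with the default index
s = pd.Series(Data)

# Predefined index values
Index = ['a', 'b', 'c', 'd', 'e', 'f', 'g']

# Create a Series with the predefined index values
si = pd.Series(Data, index=Index)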

Output:

Scalar data with default index

Scalar data with index
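To see the difference between the two Series in practice, the sketch below reuses the article's Data and Index values and reads individual elements back. The lookup variable names are illustrative assumptions.

```python
import pandas as pd

Data = [1, 3, 4, 5, 6, 2, 9]
Index = ['a', 'b', 'c', 'd', 'e', 'f', 'g']

s = pd.Series(Data)                # default index 0..6
si = pd.Series(Data, index=Index)  # custom labels 'a'..'g'

# With the default index, the label 0 refers to the first element
first_value = s.loc[0]

# With custom labels, use .loc for labels and .iloc for positions
value_at_c = si.loc['c']
last_value = si.iloc[-1]

print(s)
print(si)
```

Using .loc and .iloc explicitly keeps label-based and position-based access unambiguous.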

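Code #3

When Data is a dictionary

# Program to create a Series from a dictionary

# Import the pandas library
import pandas as pd

# Define the dictionary
dictionary = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}

# Create a Series from the dictionary
sd = pd.Series(dictionary)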

Output:

Dictionary type data
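As a rough illustration of how a dictionary maps onto a Series, the sketch below shows that the keys become the index labels. The second part, with the label 'z', is an illustrative assumption added to show what happens when a requested label has no matching key.

```python
import pandas as pd

dictionary = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}

# The dictionary keys become the index labels
sd = pd.Series(dictionary)
labels = list(sd.index)

# With an explicit index, values are matched by key;
# a label with no matching key ('z' here) becomes NaN
subset = pd.Series(dictionary, index=['e', 'a', 'z'])
missing = subset.isna().tolist()

print(sd)
print(subset)
```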

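Code #4

When Data contains an ndarray

# Program to create a Series from array-like data

# Import the pandas library
import pandas as pd

# Define a 2-D array (written here as a nested list)
Data = [[2, 3, 4], [5, 6, 7]]

# Create a Series from the nested list; each inner list becomes one element
snd = pd.Series(Data)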

Output:

Data as Ndarray
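The example above writes its array as a nested Python list. As a hedged comparison, the sketch below builds a Series from an actual 1-D NumPy array and then inspects the nested-list Series. The variable names are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# A 1-D NumPy array maps directly onto a Series, one value per element
arr = np.array([2, 3, 4, 5, 6, 7])
s_arr = pd.Series(arr)

# A nested list yields one Series element per inner list
snd = pd.Series([[2, 3, 4], [5, 6, 7]])
n_elements = len(snd)        # number of inner lists
first_element = snd.iloc[0]  # the first inner list

print(s_arr)
print(snd)
```

The nested-list Series therefore holds two list objects rather than six numbers.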

Data Frame

A DataFrame is a two-dimensional (2-D) data structure defined in pandas, consisting of rows and columns.

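Code #1

Create a DataFrame

# Program to create a DataFrame

# Import the pandas library
import pandas as pd

# Create a DataFrame from data
# (Data is a placeholder; concrete values follow in the later examples)
a = pd.DataFrame(Data)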

Here, Data can be:

  • One or more dictionaries
  • One or more Series
  • A 2-D NumPy ndarray
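As a small sketch of how input shape maps to rows and columns, the example below builds the same table two ways. The column names 'x' and 'y' and the values are illustrative assumptions.

```python
import pandas as pd

# A list of dictionaries: each dictionary becomes one row
from_dicts = pd.DataFrame([{'x': 1, 'y': 2}, {'x': 3, 'y': 4}])

# A dictionary of Series: each Series becomes one column
from_series = pd.DataFrame({'x': pd.Series([1, 3]),
                            'y': pd.Series([2, 4])})

# Both frames have two rows and two columns
shapes = (from_dicts.shape, from_series.shape)
column_names = list(from_dicts.columns)

print(from_dicts)
print(from_series)
```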

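Code #2

When Data is a dictionary

# Program to create a DataFrame from two dictionaries

# Import the pandas library
import pandas as pd

# Define dictionary 1
dict1 = {'a': 1, 'b': 2, 'c': 3, 'd': 4}

# Define dictionary 2
dict2 = {'a': 5, 'b': 6, 'c': 7, 'd': 8, 'e': 9}

# Define Data from dict1 and dict2
Data = {'first': dict1, 'second': dict2}

# Create the DataFrame
df = pd.DataFrame(Data)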

Output:

DataFrame with two dictionaries
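Because dict2 has a key 'e' that dict1 lacks, it may help to check how pandas fills the gap. The sketch below reuses the article's dictionaries. The inspection variable names are illustrative assumptions.

```python
import pandas as pd

dict1 = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
dict2 = {'a': 5, 'b': 6, 'c': 7, 'd': 8, 'e': 9}

df = pd.DataFrame({'first': dict1, 'second': dict2})

# Outer keys become column names; inner keys become row labels
columns = list(df.columns)
row_labels = sorted(df.index)

# dict1 has no key 'e', so that cell is filled with NaN
e_missing = pd.isna(df.loc['e', 'first'])
e_second = df.loc['e', 'second']

print(df)
```

The NaN also turns the 'first' column into a floating-point column, since NaN is a float value.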

Code #3

When Data is a Series

# Program to create a DataFrame from three Series
import pandas as pd

# Define Series 1
s1 = pd.Series([1, 3, 4, 5, 6, 2, 9])

# Define Series 2
s2 = pd.Series([1.1, 3.5, 4.7, 5.8, 2.9, 9.3])

# Define Series 3
s3 = pd.Series(['a', 'b', 'c', 'd', 'e'])

# Define Data
Data = {'first': s1, 'second': s2, 'third': s3}

# Create the DataFrame
dfseries = pd.DataFrame(Data)

Output:

DataFrame of three Series
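The three Series have different lengths (7, 6 and 5), so pandas aligns them on their index and pads the shorter ones. A minimal sketch of that check, reusing the article's values, might look like this. The summary variable names are illustrative assumptions.

```python
import pandas as pd

s1 = pd.Series([1, 3, 4, 5, 6, 2, 9])
s2 = pd.Series([1.1, 3.5, 4.7, 5.8, 2.9, 9.3])
s3 = pd.Series(['a', 'b', 'c', 'd', 'e'])

dfseries = pd.DataFrame({'first': s1, 'second': s2, 'third': s3})

# The frame is as long as the longest Series
shape = dfseries.shape

# Shorter Series are padded with missing values at the unmatched labels
missing_per_column = dfseries.isna().sum().to_dict()

print(dfseries)
print(missing_per_column)
```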

Code #4

When Data is a 2-D NumPy ndarray

Note: One constraint must hold when creating a DataFrame from 2-D arrays: the arrays must have the same dimensions.

# Program to create a DataFrame from 2-D arrays

# Import the pandas library
import pandas as pd

# Define 2-D array 1 (written here as a nested list)
d1 = [[2, 3, 4], [5, 6, 7]]

# Define 2-D array 2 (written here as a nested list)
d2 = [[2, 4, 8], [1, 3, 9]]

# Define Data
Data = {'first': d1, 'second': d2}

# Create the DataFrame; each inner list is stored as a single cell
df2d = pd.DataFrame(Data)

Output:

DataFrame with 2d ndarray
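The example above uses nested Python lists inside a dictionary. As a hedged comparison, the sketch below passes a real 2-D NumPy array to the constructor and also checks the length constraint. The column names 'x', 'y', 'z' and the mismatched input are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# A genuine 2-D NumPy array: 2 rows, 3 columns
arr = np.array([[2, 3, 4], [5, 6, 7]])
df_arr = pd.DataFrame(arr, columns=['x', 'y', 'z'])

# A dictionary of nested lists stores each inner list in one cell
df2d = pd.DataFrame({'first': [[2, 3, 4], [5, 6, 7]],
                     'second': [[2, 4, 8], [1, 3, 9]]})
cell = df2d.loc[0, 'first']

# Columns of different lengths are rejected with a ValueError
mismatch_raises = False
try:
    pd.DataFrame({'first': [[2, 3, 4], [5, 6, 7]],
                  'second': [[2, 4, 8]]})
except ValueError:
    mismatch_raises = True

print(df_arr)
print(df2d)
```

Passing the array directly gives one numeric column per array column, which is usually the more convenient layout for analysis.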


Statement:
This article is reproduced from jb51.net. If there is any infringement, please contact admin@php.cn to have it removed.