Detailed explanation of how Python uses Pandas for data analysis
Pandas is one of the most popular Python libraries for data analysis. It is highly optimized for performance, with its backend source code written in C, Cython, and Python.
Pandas provides two main data structures for analyzing data:
1. Series
2. DataFrame
Series is a one-dimensional (1-D) array defined in pandas and can be used to store any data type.
Create Series
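    # Program to create a Series
    # Import the pandas library
    import pandas as pd
    # Create a Series from data and an index
    # (Data and Index are placeholders for your own values)
    a = pd.Series(Data, index=Index)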
Here, Data can be scalar values, a dictionary, or an ndarray.
Note: By default, the index is 0, 1, 2, ..., n-1, where n is the length of the data.
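The default-index rule in the note can be checked with a minimal sketch. The sample values and the labels "x", "y", "z" are assumptions chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample data, chosen only for illustration.
values = [10, 20, 30]

# No index given: pandas assigns the integer labels 0..n-1.
default_indexed = pd.Series(values)

# Explicit index: one label per value, in the same order.
labelled = pd.Series(values, index=["x", "y", "z"])

print(list(default_indexed.index))
print(list(labelled.index))
```

The index must have the same length as the data when the data is a list.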
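When Data contains scalar values
    # Program to create a Series from scalar values
    import pandas as pd
    # Numeric data
    Data = [1, 3, 4, 5, 6, 2, 9]
    # Create a Series with the default index
    s = pd.Series(Data)
    # Predefined index values
    Index = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
    # Create a Series with the predefined index
    si = pd.Series(Data, Index)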
Output:
Scalar data with default index
Scalar data with index
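Once a Series has an explicit index, values can be looked up by label or by position. The following is a small sketch that reuses the data from the example above; the variable names by_label, by_position, and subset are introduced here for illustration.

```python
import pandas as pd

data = [1, 3, 4, 5, 6, 2, 9]
labels = ["a", "b", "c", "d", "e", "f", "g"]
si = pd.Series(data, index=labels)

# Label-based lookup: the value stored under the label 'c'.
by_label = si["c"]

# Position-based lookup: the third element (positions start at 0).
by_position = si.iloc[2]

# A list of labels returns a new Series holding just those entries.
subset = si[["a", "g"]]

print(by_label, by_position)
print(subset)
```

Using .iloc for positions keeps the intent explicit, which matters most when the index labels are themselves integers.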
When Data is a dictionary
    # Program to create a Series from a dictionary
    import pandas as pd
    dictionary = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}
    # Create the Series; the dictionary keys become the index
    sd = pd.Series(dictionary)
Output:
Dictionary type data
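With a dictionary, the keys become the index labels. As a hedged sketch, the code below also passes an explicit index to select and reorder keys; the label 'z' is a hypothetical key that does not exist in the dictionary, so pandas fills it with NaN.

```python
import pandas as pd

dictionary = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5}

# Keys become the index, values become the data.
sd = pd.Series(dictionary)

# An explicit index picks keys in the given order.
# 'z' is a hypothetical label with no matching key, so its value is NaN.
partial = pd.Series(dictionary, index=["e", "a", "z"])

print(sd)
print(partial)
```

Because NaN is a floating-point value, the second Series is stored as floats rather than integers.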
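When Data contains an ndarray
    # Program to create a Series from array-like data
    import pandas as pd
    # Define a 2-D array (here written as a nested list)
    Data = [[2, 3, 4], [5, 6, 7]]
    # Create the Series; each inner list becomes one element
    snd = pd.Series(Data)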
Output:
Data as Ndarray
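The example above passes a nested Python list rather than a NumPy array. The sketch below contrasts the two cases; the use of numpy.array and the variable names arr, s_arr, and nested are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# A one-dimensional NumPy array: each number becomes one Series element.
arr = np.array([2, 3, 4, 5, 6, 7])
s_arr = pd.Series(arr)

# A nested list: each inner list is stored whole as a single element,
# so the Series has two elements and an object dtype.
nested = pd.Series([[2, 3, 4], [5, 6, 7]])

print(s_arr)
print(nested)
```

A Series is one-dimensional, so tabular 2-D data is usually a better fit for a DataFrame.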
A DataFrame is a two-dimensional (2-D) data structure defined in pandas, consisting of rows and columns.
Create DataFrame
    # Program to create a DataFrame
    # Import the library
    import pandas as pd
    # Create a DataFrame from data
    # (Data is a placeholder for your own values)
    a = pd.DataFrame(Data)
Here, Data can be one or more dictionaries, one or more Series, or 2-D NumPy ndarrays.
When the data is a dictionary
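    # Program to create a DataFrame from two dictionaries
    import pandas as pd
    # Define dictionary 1
    dict1 = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
    # Define dictionary 2
    dict2 = {'a': 5, 'b': 6, 'c': 7, 'd': 8, 'e': 9}
    # Define Data from dict1 and dict2
    Data = {'first': dict1, 'second': dict2}
    # Create the DataFrame
    df = pd.DataFrame(Data)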
Output:
DataFrame with two dictionaries
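In a dictionary of dictionaries, the outer keys become column names and the inner keys become row labels. The sketch below reuses the data from the example above to show what happens to the key 'e', which is present only in the second dictionary.

```python
import pandas as pd

dict1 = {"a": 1, "b": 2, "c": 3, "d": 4}
dict2 = {"a": 5, "b": 6, "c": 7, "d": 8, "e": 9}

# Outer keys ('first', 'second') become the columns;
# inner keys ('a'..'e') become the row index.
df = pd.DataFrame({"first": dict1, "second": dict2})

# dict1 has no key 'e', so that cell is filled with NaN.
missing = df.loc["e", "first"]

print(df)
print(df.shape)
```

The column 'first' is stored as floats because it has to hold the NaN placeholder.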
When the data is a Series
    # Program to create a DataFrame from three Series
    import pandas as pd
    # Define series 1
    s1 = pd.Series([1, 3, 4, 5, 6, 2, 9])
    # Define series 2
    s2 = pd.Series([1.1, 3.5, 4.7, 5.8, 2.9, 9.3])
    # Define series 3
    s3 = pd.Series(['a', 'b', 'c', 'd', 'e'])
    # Define Data
    Data = {'first': s1, 'second': s2, 'third': s3}
    # Create the DataFrame
    dfseries = pd.DataFrame(Data)
Output:
DataFrame of three Series
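The three Series above have different lengths (7, 6, and 5). As a sketch of how pandas handles this, the code below counts the gaps left by index alignment; the variable names missing_per_column and complete are introduced here for illustration.

```python
import pandas as pd

s1 = pd.Series([1, 3, 4, 5, 6, 2, 9])
s2 = pd.Series([1.1, 3.5, 4.7, 5.8, 2.9, 9.3])
s3 = pd.Series(["a", "b", "c", "d", "e"])

# The Series are aligned on their index labels (0..6);
# shorter Series are padded with missing values.
dfseries = pd.DataFrame({"first": s1, "second": s2, "third": s3})

# Number of missing values in each column.
missing_per_column = dfseries.isna().sum()

# Keep only the rows where every column has a value.
complete = dfseries.dropna()

print(missing_per_column)
print(complete)
```

Alignment is by index label, not by position, so Series with different labels would produce a different pattern of gaps.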
When Data is a 2-D NumPy ndarray
Note: One constraint must be maintained when creating a DataFrame from 2-D arrays: the dimensions of the 2-D arrays must be the same.
    # Program to create a DataFrame from 2-D arrays
    # Import the library
    import pandas as pd
    # Define 2-D array 1
    d1 = [[2, 3, 4], [5, 6, 7]]
    # Define 2-D array 2
    d2 = [[2, 4, 8], [1, 3, 9]]
    # Define Data
    Data = {'first': d1, 'second': d2}
    # Create the DataFrame
    df2d = pd.DataFrame(Data)
Output:
DataFrame with 2d ndarray
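The example above stores nested lists, so each cell of the resulting DataFrame holds a whole inner list. A hedged alternative, sketched below, passes a single 2-D NumPy array directly so that each number gets its own cell; the column names "x", "y", "z" are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Nested lists inside a dictionary: two rows, two columns,
# and every cell contains a list of three numbers.
df2d = pd.DataFrame({
    "first": [[2, 3, 4], [5, 6, 7]],
    "second": [[2, 4, 8], [1, 3, 9]],
})

# A 2-D NumPy array passed directly: rows and columns of the array
# map to rows and columns of the DataFrame.
d1 = np.array([[2, 3, 4], [5, 6, 7]])
df_arr = pd.DataFrame(d1, columns=["x", "y", "z"])  # hypothetical column names

print(df2d)
print(df_arr)
```

The second form is usually easier to work with, because column-wise operations such as sums and filters act on individual numbers.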