What is the difference between data frames and matrices in Python Pandas?
In this article, we will show you the difference between a data frame and a matrix in Python pandas.
Data frames and matrices are both two-dimensional data structures. Generally speaking, a data frame can contain multiple types of data (numbers, strings, categories, etc.), while a matrix can store only one type of data.
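The contrast can be seen by inspecting data types. The snippet below is a minimal sketch that assumes a NumPy array stands in for the matrix; the variable names are illustrative and not part of the original article.

```python
import numpy as np
import pandas as pd

# A data frame may hold a different data type in each column.
frame = pd.DataFrame({"name": ["Virat", "Rohit"], "age": [25, 30]})
print(frame.dtypes)

# A NumPy array (used here as the matrix) has a single dtype for every element.
matrix = np.array([[1, 2], [3, 4]])
print(matrix.dtype)

# Mixing numbers and text makes NumPy convert everything to one common type.
mixed = np.array([[1, "a"], [2, "b"]])
print(mixed.dtype)
```

In the last case the integers are stored as text, which is why a matrix is a poor fit for mixed data.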
In Python, a pandas DataFrame is a two-dimensional, tabular, mutable data structure whose columns can hold objects of different data types. Both of its axes, rows and columns, are labeled. DataFrames are useful in data preprocessing because they provide many data processing methods. A DataFrame can also be used to create pivot tables and to plot data using Matplotlib.
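As a small illustration of the pivot-table feature, the sketch below averages ages by job role and city. The `staff` data and its `City` column are invented for this example.

```python
import pandas as pd

# Hypothetical sample data, invented for this illustration.
staff = pd.DataFrame({
    "Jobrole": ["Developer", "Analyst", "Developer", "Analyst"],
    "City": ["Pune", "Pune", "Delhi", "Delhi"],
    "Age": [25, 30, 35, 40],
})

# One row per job role, one column per city, each cell holding the mean age.
summary = staff.pivot_table(values="Age", index="Jobrole", columns="City", aggfunc="mean")
print(summary)
```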
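Data frames can be used for a variety of tasks, such as supplying the data for fitting statistical models.
Data processing by label, such as filtering or grouping by column name (a plain matrix has no column labels, so it is usually converted to a data frame first).
Transposing, that is, converting rows to columns and vice versa, which is very useful in data science.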
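Converting rows to columns can be sketched with the `T` attribute of a DataFrame. The `scores` data below is made up for the illustration.

```python
import pandas as pd

# Hypothetical scores: one row per person, one column per subject.
scores = pd.DataFrame(
    {"math": [90, 75], "physics": [85, 60]},
    index=["Virat", "Rohit"],
)

# .T swaps the axes: row labels become column labels and vice versa.
transposed = scores.T
print(transposed)
```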
Follow these steps to create a data frame -
Use the import keyword to import the pandas and numpy modules with aliases.
Use the DataFrame() function of the pandas module to create a data frame.
Print the input data frame.
The following program uses the DataFrame() function to return a data frame -
```python
# importing pandas, numpy modules with alias names
import pandas as pd
import numpy as np

# creating a dataframe
inputDataframe = pd.DataFrame({'Name': ['Virat', 'Rohit', 'Meera', 'Nick', 'Sana'],
                               'Jobrole': ['Developer', 'Analyst', 'Help Desk', 'Database Developer', 'Finance accountant'],
                               'Age': [25, 30, 28, 25, 40]})

# displaying the dataframe
print(inputDataframe)
```
When executed, the above program will generate the following output -
```
    Name             Jobrole  Age
0  Virat           Developer   25
1  Rohit             Analyst   30
2  Meera           Help Desk   28
3   Nick  Database Developer   25
4   Sana  Finance accountant   40
```
A matrix is a collection of homogeneous data arranged in a two-dimensional rectangular grid. It is an m×n array whose elements all share the same data type. In Python, it is typically created as a NumPy array from a nested list of rows. The number of rows and columns is fixed once the matrix is created. NumPy supports arithmetic operations such as addition, subtraction, multiplication, and division on matrices.
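The arithmetic operations mentioned above can be sketched with NumPy arrays. The matrices `a` and `b` are illustrative values chosen for this example.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

print(a + b)  # element-wise addition
print(a - b)  # element-wise subtraction
print(a * b)  # element-wise multiplication (not the matrix product)
print(a / b)  # element-wise division
print(a @ b)  # matrix product
```

Note that `*` multiplies element by element, while `@` computes the matrix product.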
Matrices are useful in economics for calculating statistics such as GDP (gross domestic product) or per capita income.
They are also useful for studying electrical and electronic circuits.
Matrices are used in research, for example when plotting graphs.
Matrices are also useful in probability and statistics.
Follow these steps to load two matrices into data frames and multiply them -
Use the import keyword to import the pandas module with an alias.
Create two variables to store the two input matrices respectively.
Use the pandas DataFrame() function to create a data frame for each of the two matrices and store them in separate variables.
Print the data frame of input matrix 1.
Print the dimensions (shape) of input matrix 1 by applying the shape attribute.
Print the data frame of input matrix 2.
Print the dimensions (shape) of input matrix 2 by applying the shape attribute.
Use the dot() function to multiply inputMatrix_1 by inputMatrix_2 and store the result in a variable.
Print the resulting matrix.
Print the dimensions (shape) of the resulting matrix by applying the shape attribute.
The following program loads two matrices into data frames with the DataFrame() function and multiplies them with the dot() function -
```python
# importing pandas module
import pandas as pd

# input matrix 1
inputMatrix_1 = [[1, 2, 2],
                 [1, 2, 0],
                 [1, 0, 2]]

# input matrix 2
inputMatrix_2 = [[1, 0, 1],
                 [2, 1, 1],
                 [2, 1, 2]]

# creating a dataframe of first matrix
# (here data is loaded into a pandas DataFrame)
df_1 = pd.DataFrame(data=inputMatrix_1)

# creating a dataframe of second matrix
df_2 = pd.DataFrame(data=inputMatrix_2)

# printing the dataframe of input matrix 1
print("inputMatrix_1:")
print(df_1)

# printing the dimensions(shape) of input matrix 1
print("The dimensions(shape) of input matrix 1:")
print(df_1.shape)
print()

# printing the dataframe of input matrix 2
print("inputMatrix_2:")
print(df_2)

# printing the dimensions(shape) of input matrix 2
print("The dimensions(shape) of input matrix 2:")
print(df_2.shape)
print()

# multiplying both the matrices inputMatrix_1 and inputMatrix_2
result_mult = df_1.dot(df_2)

# printing the result of the matrix multiplication of inputMatrix_1 and inputMatrix_2
print("Resultant Matrix after Matrix multiplication:")
print(result_mult)

# printing the dimensions(shape) of resultant Matrix
print("The dimensions(shape) of Resultant Matrix:")
print(result_mult.shape)
```
When executed, the above program will generate the following output -
```
inputMatrix_1:
   0  1  2
0  1  2  2
1  1  2  0
2  1  0  2
The dimensions(shape) of input matrix 1:
(3, 3)

inputMatrix_2:
   0  1  2
0  1  0  1
1  2  1  1
2  2  1  2
The dimensions(shape) of input matrix 2:
(3, 3)

Resultant Matrix after Matrix multiplication:
   0  1  2
0  9  4  7
1  5  2  3
2  5  2  5
The dimensions(shape) of Resultant Matrix:
(3, 3)
```
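The product can be cross-checked against plain NumPy. This is a minimal sketch that assumes NumPy arrays as the matrix representation; the snake_case variable names are chosen for this example, and `to_numpy()` is used to drop the row and column labels from the data frame result.

```python
import numpy as np
import pandas as pd

input_matrix_1 = [[1, 2, 2], [1, 2, 0], [1, 0, 2]]
input_matrix_2 = [[1, 0, 1], [2, 1, 1], [2, 1, 2]]

# Product computed through data frames, as in the program above.
frame_product = pd.DataFrame(input_matrix_1).dot(pd.DataFrame(input_matrix_2))

# The same product computed directly on NumPy arrays.
array_product = np.array(input_matrix_1) @ np.array(input_matrix_2)

# to_numpy() returns the values of the data frame as a 2-D array without labels.
print(np.array_equal(frame_product.to_numpy(), array_product))
```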
The following table summarizes the differences between a matrix and a data frame.
Matrix | Data frame |
---|---|
It is a collection of data arranged in a two-dimensional rectangular grid. | It stores tabular data with multiple data types in multiple columns called fields. |
A matrix is an m×n array whose elements all share one data type. | A data frame is a collection of equal-length columns, each with its own data type. A data frame is a generalized form of a matrix. |
A matrix has a fixed number of rows and columns. | The number of rows and columns of a data frame is variable. |
Homogeneous | Heterogeneous |
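The variable size of a data frame can be sketched as follows. The names and values are illustrative, and `pd.concat` is only one of several ways to append a row.

```python
import pandas as pd

frame = pd.DataFrame({"Name": ["Virat", "Rohit"], "Age": [25, 30]})
print(frame.shape)  # two rows, two columns

# Adding a column of a different data type changes the number of columns.
frame["Jobrole"] = ["Developer", "Analyst"]

# Appending a row changes the number of rows.
new_row = pd.DataFrame([{"Name": "Meera", "Age": 28, "Jobrole": "Help Desk"}])
frame = pd.concat([frame, new_row], ignore_index=True)
print(frame.shape)  # three rows, three columns
```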
In this article, we learned about the difference between matrices and data frames in Python. We also learned how to create a data frame and how to load a matrix into a data frame.