Get data type of column in Pandas - Python

PHPz
2023-08-30 20:01:02

Pandas is a popular and powerful Python library commonly used for data analysis and manipulation. It provides data structures, chiefly Series and DataFrame, for working with tabular and time series data.

A Pandas DataFrame is a two-dimensional tabular data structure in which each column can hold a different data type. We often need to know the data type of a particular column, for example before doing arithmetic on it or converting it. In this article, we'll cover several ways to determine the data type of a column in Pandas.

Before proceeding, let us create a sample DataFrame whose column data types we will inspect:

import pandas as pd

# create a sample dataframe
df = pd.DataFrame({'Vehicle name': ['Supra', 'Honda', 'Lamorghini'],'price': [5000000, 600000, 7000000]})

print(df)

Output

This Python script prints the DataFrame we created:

  Vehicle name    price
0        Supra  5000000
1        Honda   600000
2   Lamorghini  7000000

You can use any of the following methods to complete the task:

Methods

  • Use dtypes attribute

  • Use select_dtypes()

  • Use info() method

  • Use describe() function

Now let us discuss each method and how to use them to get the data type of a column in Pandas.

Method 1: Use the dtypes attribute

We can use the dtypes attribute to get the data type of each column in the DataFrame. It returns a Series indexed by column name, where each value is that column's data type. The following syntax can be used:

Syntax

df.dtypes

Return type: a Series containing the data type of each column in the DataFrame.

Algorithm

  • Import the Pandas library.

  • Use the pd.DataFrame() function to create a DataFrame, passing the sample data as a dictionary.

  • Use the dtypes attribute to get the data type of each column in the DataFrame.

  • Print the results to check the data type of each column.

Example 1

# import the Pandas library
import pandas as pd

# create a sample dataframe
df = pd.DataFrame({'Vehicle name': ['Supra', 'Honda', 'Lamorghini'],'price': [5000000, 600000, 7000000]})

# print the dataframe
print("DataFrame:\n", df)

# get the data types of each column
print("\nData types of each column:")
print(df.dtypes)

Output

DataFrame:
   Vehicle name    price
0        Supra  5000000
1        Honda   600000
2   Lamorghini  7000000

Data types of each column:
Vehicle name    object
price            int64
dtype: object
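
Because df.dtypes is an ordinary Series indexed by column name, it can be converted or filtered like any other Series. The following is a minimal sketch rather than part of the original example; the variable names dtype_map and integer_columns are chosen here purely for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Vehicle name': ['Supra', 'Honda', 'Lamorghini'],
                   'price': [5000000, 600000, 7000000]})

# df.dtypes is a Series: the index holds column names, the values hold dtypes
dtype_series = df.dtypes

# convert to a plain dict of column name -> dtype name (illustrative variable name)
dtype_map = dtype_series.astype(str).to_dict()
print(dtype_map)

# keep only the columns whose dtype name starts with "int" (illustrative variable name)
integer_columns = [name for name, dtype_name in dtype_map.items()
                   if dtype_name.startswith('int')]
print(integer_columns)
```

The exact dtype names can vary with the platform and the pandas version, so the sketch matches on the "int" prefix instead of a fixed width such as int64.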

Example 2

In this example, we get the data type of a single column of the DataFrame by looking up its name in the Series returned by dtypes:

# import the Pandas library
import pandas as pd

# create a sample dataframe
df = pd.DataFrame({'Vehicle name': ['Supra', 'Honda', 'Lamorghini'],'price': [5000000, 600000, 7000000]})

# print the dataframe
print("DataFrame:\n", df)

# get the data type of the column named price
print("\nData types of column named price:")
print(df.dtypes['price'])

Output

DataFrame:
   Vehicle name    price
0        Supra  5000000
1        Honda   600000
2   Lamorghini  7000000

Data types of column named price:
int64
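
A single column is itself a Series, so its type can also be read from the dtype attribute (singular) of that column. The sketch below shows this alternative alongside the type-checking helpers in pandas.api.types; it is offered as an illustration and is not part of the original example.

```python
import pandas as pd
from pandas.api.types import is_integer_dtype, is_numeric_dtype

df = pd.DataFrame({'Vehicle name': ['Supra', 'Honda', 'Lamorghini'],
                   'price': [5000000, 600000, 7000000]})

# read the dtype straight from the column (a Series)
price_dtype = df['price'].dtype
print(price_dtype)

# it is the same object you get by looking the column up in df.dtypes
same_as_lookup = price_dtype == df.dtypes['price']
print(same_as_lookup)

# helpers that answer "is this column an integer / numeric column?"
price_is_integer = is_integer_dtype(df['price'])
name_is_numeric = is_numeric_dtype(df['Vehicle name'])
print(price_is_integer, name_is_numeric)
```

The helper functions are handy when the code only needs to know the kind of data in a column and should not depend on a specific width such as int32 or int64.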

Method 2: Use select_dtypes()

We can use the select_dtypes() method to keep only the columns of the data types we need. It returns a subset of the DataFrame's columns based on the data types passed as input, so we can select the columns that belong to a specific type and then read the data type of each one.

Algorithm

  • Import the Pandas library.

  • Use the pd.DataFrame() function to create a DataFrame and pass the given data as a dictionary.

  • Print the DataFrame to check the created data.

  • Use the select_dtypes() method to select all numeric columns from the DataFrame, passing the list of data types we want to the include parameter.

  • Loop over the selected columns and print the data type of each one.

Example

# import the Pandas library
import pandas as pd

# create a sample dataframe
df = pd.DataFrame({'Vehicle name': ['Supra', 'Honda', 'Lamorghini'],'price': [5000000, 600000, 7000000]})

# print the dataframe
print("DataFrame:\n", df)

# select the numeric columns
numeric_cols = df.select_dtypes(include=['float64', 'int64']).columns

# get the data type of each numeric column
for col in numeric_cols:
    print("Data Type of column", col, "is", df[col].dtype)

Output

DataFrame:
   Vehicle name    price
0        Supra  5000000
1        Honda   600000
2   Lamorghini  7000000
Data Type of column price is int64
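
Listing 'float64' and 'int64' explicitly misses other numeric widths such as int32 or float32. As a hedged alternative, select_dtypes() also accepts the generic string 'number', and its exclude parameter selects everything except the given types. The variable names below are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'Vehicle name': ['Supra', 'Honda', 'Lamorghini'],
                   'price': [5000000, 600000, 7000000]})

# 'number' matches every numeric dtype, whatever its width
numeric_df = df.select_dtypes(include='number')

# exclude= returns the complement: all non-numeric columns
non_numeric_df = df.select_dtypes(exclude='number')

print(list(numeric_df.columns))      # ['price']
print(list(non_numeric_df.columns))  # ['Vehicle name']
```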

Method 3: Use the info() method

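We can also use the info() method for this task. It prints a concise summary of the DataFrame, including the data type of each column. The following syntax can be used:

Syntax

DataFrame.info(verbose=None, buf=None, max_cols=None, memory_usage=None, show_counts=None)

(In older pandas versions the last parameter was named null_counts; it was later replaced by show_counts.)

Return value: None. The summary is printed to standard output, or written to buf if one is given.

Algorithm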

  • Import the Pandas library.

  • Use the pd.DataFrame() function to create a DataFrame and pass the above data as a dictionary.

  • Print the DataFrame to check the created data.

  • Use the info() method to get information about the DataFrame.

  • Read the Dtype column of the printed summary to find the data type of each column.

Example

# import the Pandas library
import pandas as pd

# create a sample dataframe
df = pd.DataFrame({'Vehicle name': ['Supra', 'Honda', 'Lamorghini'],'price': [5000000, 600000, 7000000]})

# print the dataframe
print("DataFrame:\n", df)

# use the info() method to print a summary that includes each column's data type
# (info() prints the summary itself and returns None, so it is not wrapped in print())
df.info()

Output

DataFrame:
   Vehicle name    price
0        Supra  5000000
1        Honda   600000
2   Lamorghini  7000000
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 2 columns):
 #   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   Vehicle name  3 non-null      object
 1   price         3 non-null      int64 
dtypes: int64(1), object(1)
memory usage: 176.0+ bytes
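
Since info() returns None and writes its summary to standard output, the text cannot be stored by assigning the return value. One way to capture it, sketched here under the assumption that an in-memory io.StringIO buffer is acceptable, is to pass that buffer through the buf parameter.

```python
import io
import pandas as pd

df = pd.DataFrame({'Vehicle name': ['Supra', 'Honda', 'Lamorghini'],
                   'price': [5000000, 600000, 7000000]})

# write the summary into an in-memory text buffer instead of the console
buffer = io.StringIO()
result = df.info(buf=buffer)

# info() itself returns None; the text lives in the buffer
summary_text = buffer.getvalue()
print(result)        # None
print(summary_text)
```

The captured string can then be logged or saved to a file. For programmatic access to the types themselves, df.dtypes is usually the more convenient choice.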

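Method 4: Use the describe() method

The describe() method generates descriptive statistics of a DataFrame and returns them as a new DataFrame. Reading the dtypes attribute of that result gives the data types of the summary columns, which are not always the same as those of the original columns: statistics such as the mean are floating-point numbers, so an integer column like price is reported as float64. Use this method when you also need the statistics, and rely on df.dtypes for the actual column types.

Algorithm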

  • Use the import statement to import the Pandas library.

  • Use the pd.DataFrame() function to create a DataFrame and pass the given data as a dictionary.

  • Print the DataFrame to check the created data.

  • Use the describe() method to obtain the descriptive statistics of the DataFrame.

  • Set the include parameter of the describe() method to 'all' to include every column in the descriptive statistics.

  • Use the dtypes attribute of the result to get the data type of each column in the summary DataFrame.

  • Print the data type of each summary column.

Example

# import the Pandas library
import pandas as pd

# create a sample dataframe
df = pd.DataFrame({'Vehicle name': ['Supra', 'Honda', 'Lamorghini'],'price': [5000000, 600000, 7000000]})

# print the dataframe
print("DataFrame:\n", df)

# use the describe() method to get the descriptive statistics of the dataframe
desc_stats = df.describe(include='all')

# get the data type of each column of the summary DataFrame (not of df itself)
dtypes = desc_stats.dtypes

# print the data type of each column
print("Data type of each column in the descriptive statistics:\n", dtypes)

Output

DataFrame:
   Vehicle name    price
0        Supra  5000000
1        Honda   600000
2   Lamorghini  7000000
Data type of each column in the descriptive statistics:
 Vehicle name     object
price           float64
dtype: object
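
Note that the output above reports price as float64 even though the column in df holds integers. The following sketch puts the two results side by side to make the difference visible; the column labels 'original' and 'describe' are illustrative names chosen for this comparison.

```python
import pandas as pd
from pandas.api.types import is_float_dtype, is_integer_dtype

df = pd.DataFrame({'Vehicle name': ['Supra', 'Honda', 'Lamorghini'],
                   'price': [5000000, 600000, 7000000]})

# data types of the original columns
original_dtypes = df.dtypes

# data types of the summary produced by describe()
summary_dtypes = df.describe(include='all').dtypes

# line the two up, one row per column name
comparison = pd.DataFrame({'original': original_dtypes,
                           'describe': summary_dtypes})
print(comparison)

# price is an integer column in df but a float column in the summary
print(is_integer_dtype(original_dtypes['price']))  # True
print(is_float_dtype(summary_dtypes['price']))     # True
```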

Conclusion

Knowing how to obtain the data type of each column helps us carry out data manipulation and analysis efficiently. Each method has its own trade-offs: the dtypes attribute is the most direct, select_dtypes() is useful for filtering columns by type, info() prints a quick overview, and describe() returns statistics whose types can differ from those of the original columns. Choose the one that fits the task at hand and your coding preferences.

Statement:
This article is reproduced from tutorialspoint.com. If there is any infringement, please contact admin@php.cn to have it removed.