The astype() function is a method provided by the pandas and NumPy libraries rather than by Python itself. In pandas it converts a Series, a whole DataFrame, or selected DataFrame columns to a specified data type; in NumPy it casts the elements of an array to a different type.
DataFrame.astype(dtype, copy=True, errors='raise')
ndarray.astype(dtype, order='K', casting='unsafe', subok=True, copy=True)
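dtype: the target data type. It can be given as a Python type (such as int or float), a NumPy dtype (such as np.float32), a string alias (such as 'int8'), or, for a DataFrame, a dictionary that maps column names to types.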
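As a minimal sketch of the forms the dtype argument can take (the sample data is made up for illustration), the same conversion can be written with a Python type, a string alias, or a NumPy dtype, while a dictionary targets individual DataFrame columns:

```python
import numpy as np
import pandas as pd

# Illustrative data, not taken from the article.
s = pd.Series([1, 2, 3])

as_builtin = s.astype(float)       # Python built-in type
as_string = s.astype("float64")    # string alias
as_numpy = s.astype(np.float64)    # NumPy dtype object

# All three spellings produce the same dtype.
same = as_builtin.dtype == as_string.dtype == as_numpy.dtype
print(same)

# A dict maps column names to target types; unlisted columns are untouched.
df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})
converted = df.astype({"A": "float32"})
print(converted.dtypes)
```

Columns that are not named in the dictionary keep their original type, which makes the dict form convenient for partial conversions.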
import pandas as pd

# Example DataFrame
df = pd.DataFrame({'A': ['1', '2', '3'], 'B': [1.5, 2.5, 3.5]})

# Convert column 'A' to integer
df['A'] = df['A'].astype(int)
print(df.dtypes)

Output:

A      int64
B    float64
dtype: object
# Convert multiple columns
df = df.astype({'A': float, 'B': int})
print(df.dtypes)
Output:
A    float64
B      int64
dtype: object
df = pd.DataFrame({'A': ['1', 'two', '3'], 'B': [1.5, 2.5, 3.5]})

# Attempt conversion with errors='ignore'
df['A'] = df['A'].astype(int, errors='ignore')
print(df)

Output:

     A    B
0    1  1.5
1  two  2.5
2    3  3.5
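The errors='ignore' example above returns the column unconverted. For contrast, here is a hedged sketch of the two other behaviors a reader is likely to want: the default errors='raise', and pd.to_numeric with errors='coerce'. The latter is a different pandas function, swapped in here as an alternative rather than something astype() itself offers.

```python
import pandas as pd

s = pd.Series(["1", "two", "3"])

# With the default errors='raise', the bad value stops the conversion.
try:
    s.astype(int)
    raised = False
except ValueError:
    raised = True
print(raised)

# pd.to_numeric(errors='coerce') turns unparseable entries into NaN,
# so the result is a float column with a visible gap.
coerced = pd.to_numeric(s, errors="coerce")
print(coerced.isna().tolist())
```

Coercing makes the failed rows easy to locate with isna(), whereas ignoring the error leaves a string column that looks numeric at a glance.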
import numpy as np

# Example array
arr = np.array([1.1, 2.2, 3.3])

# Convert to integer
arr_int = arr.astype(int)
print(arr_int)

Output:

[1 2 3]
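Note that the float-to-integer cast above discards the fractional part rather than rounding. A small sketch, using made-up values, of the difference between casting directly and rounding first with np.rint:

```python
import numpy as np

arr = np.array([1.9, -1.9, 2.5])

truncated = arr.astype(int)          # drops the fractional part (toward zero)
rounded = np.rint(arr).astype(int)   # round to nearest first, then cast

print(truncated)
print(rounded)
```

np.rint rounds halves to the nearest even integer, so 2.5 becomes 2 here; choose the rounding step that matches the data's meaning before casting.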
arr = np.array([1.1, 2.2, 3.3])

# Attempt a conversion that the 'safe' casting rule does not allow
try:
    arr_str = arr.astype(str, casting='safe')
except TypeError as e:
    print(e)

Output:

Cannot cast array data from dtype('float64') to dtype('<U32') according to the rule 'safe'
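To illustrate how the casting parameter gates conversions, here is a minimal sketch with numeric types only: a narrowing cast that the 'safe' rule rejects and a widening cast that it accepts. np.can_cast is used to query the rule without performing the conversion.

```python
import numpy as np

arr = np.array([1.1, 2.2, 3.3])  # float64

# float64 -> int32 can lose information, so 'safe' does not permit it.
allowed = np.can_cast(np.float64, np.int32, casting="safe")
print(allowed)

try:
    arr.astype(np.int32, casting="safe")
    rejected = False
except TypeError:
    rejected = True
print(rejected)

# int32 -> float64 preserves every value, so 'safe' permits it.
ints = np.array([1, 2, 3], dtype=np.int32)
widened = ints.astype(np.float64, casting="safe")
print(widened.dtype)
```

The default casting='unsafe' performs any conversion, so passing 'safe' or 'same_kind' is a way to make lossy casts fail loudly.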
df = pd.DataFrame({'A': ['2022-01-01', '2023-01-01'], 'B': ['True', 'False']})

# Convert to datetime and boolean.
# astype(bool) tests truthiness, so any non-empty string, including 'False',
# would become True; map the strings to booleans explicitly instead.
df['A'] = pd.to_datetime(df['A'])
df['B'] = df['B'].map({'True': True, 'False': False})
print(df.dtypes)

Output:

A    datetime64[ns]
B              bool
dtype: object

Silent Errors with errors='ignore': Use with caution, because a failed conversion returns the original data without any warning.
Loss of Precision: Converting from a higher-precision type (e.g., float64) to a lower-precision type (e.g., float32) can discard significant digits.
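The loss-of-precision pitfall noted above can be made concrete with a short sketch. The values are chosen for illustration: float32 carries roughly seven significant decimal digits, so a larger number cannot survive a round trip.

```python
import pandas as pd

s64 = pd.Series([0.1, 123456789.123456789])
s32 = s64.astype("float32")

# Casting back to float64 does not restore the digits that were dropped.
round_trip = s32.astype("float64")
max_error = (round_trip - s64).abs().max()

print(max_error > 0)
```

Comparing a round-tripped copy against the original like this is one simple way to check whether a downcast is acceptable for a given column.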
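import pandas as pd

# Original DataFrame
df = pd.DataFrame({'A': [1, 2, 3], 'B': [1.1, 2.2, 3.3]})
print("Original memory usage:")
print(df.memory_usage())

# Downcast numerical types
df['A'] = df['A'].astype('int8')
df['B'] = df['B'].astype('float32')
print("Optimized memory usage:")
print(df.memory_usage())

Output (original memory usage):

Index    128
A         24
B         24
dtype: int64

After the downcast, column A occupies 3 bytes (three int8 values) and column B occupies 12 bytes (three float32 values); the Index entry is unchanged.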
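Downcasting saves memory only if the values actually fit the smaller type. The following is a sketch of a range check before casting; fits_in is a hypothetical helper written for this example, not part of pandas or NumPy.

```python
import numpy as np
import pandas as pd

def fits_in(series, dtype):
    """Hypothetical helper: True if every value lies within the integer dtype's range."""
    info = np.iinfo(np.dtype(dtype))
    return bool(series.min() >= info.min and series.max() <= info.max)

small = pd.Series([1, 2, 3])
large = pd.Series([1, 2, 300])   # 300 is outside the int8 range of -128..127

print(fits_in(small, "int8"))
print(fits_in(large, "int8"))

# Downcast only when the check passes.
small_downcast = small.astype("int8") if fits_in(small, "int8") else small
print(small_downcast.memory_usage(index=False))
```

np.iinfo reports the limits for integer types; np.finfo plays the same role for floating-point types.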
The astype() function is a versatile tool for data type conversion in both pandas and NumPy. It allows fine-grained control over casting behavior, memory optimization, and error handling. Proper use of its parameters, such as errors in pandas and casting in NumPy, ensures robust and efficient data type transformations.