How to efficiently convert a Pandas DataFrame with missing values into a NumPy array?
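To convert a Pandas DataFrame with missing values to a NumPy array, use df.to_numpy(). It is preferred over the older df.values attribute for several reasons: it is an explicit method call, it accepts a dtype argument to control the result type, a copy argument to request an independent copy, and (in recent pandas versions) an na_value argument to choose how missing values are represented. By default, missing values in float columns come through as np.nan: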
<code class="python">import pandas as pd
import numpy as np

# Create a DataFrame with missing values
df = pd.DataFrame({'A': [np.nan, np.nan, 0.1, 0.1, 0.1, 0.1],
                   'B': [0.2, np.nan, 0.2, 0.2, np.nan, np.nan],
                   'C': [np.nan, 0.5, 0.5, np.nan, 0.5, np.nan]})

# Convert to a NumPy array with missing values represented as `np.nan`
array = df.to_numpy()

# Result:
# array([[nan, 0.2, nan],
#        [nan, nan, 0.5],
#        [0.1, 0.2, 0.5],
#        [0.1, 0.2, nan],
#        [0.1, nan, 0.5],
#        [0.1, nan, nan]])</code>
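The optional arguments of to_numpy can be sketched as follows. This is a minimal illustration, not the only way to handle missing values: the small DataFrame and the choice of 0.0 as a fill value are assumptions made for the example, and na_value requires a reasonably recent pandas release.

```python
import numpy as np
import pandas as pd

# Small illustrative DataFrame (assumed data, not from the article)
df = pd.DataFrame({'A': [np.nan, 0.1], 'B': [0.2, np.nan]})

# Replace missing values with a sentinel (here 0.0) during conversion
filled = df.to_numpy(na_value=0.0)

# Request a specific dtype and an independent copy of the data
as_f32 = df.to_numpy(dtype='float32', copy=True)

# A boolean mask of the missing positions can be kept alongside the array
mask = df.isna().to_numpy()

print(filled)
print(as_f32.dtype)
print(mask)
```

Keeping the mask separately is useful when the fill value could be confused with real data.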
to_numpy returns a single array with one common dtype, so per-column dtypes are not preserved. To keep them, build a structured (record) array instead, for example with np.rec.fromrecords:
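<code class="python">import pandas as pd
import numpy as np

# Create a DataFrame with mixed data types and a labelled index
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7.2, 8.1, 9.3]},
                  index=['a', 'b', 'c'])

# Move the index into a regular column so it becomes a field of the records
v = df.reset_index()

# Convert to a structured array with preserved dtypes;
# field names follow the column order of `v` ('index' comes first)
struct_array = np.rec.fromrecords(v, names=v.columns.tolist())

# Result (integer width may differ by platform):
# rec.array([('a', 1, 4, 7.2), ('b', 2, 5, 8.1), ('c', 3, 6, 9.3)],
#           dtype=[('index', '<U1'), ('A', '<i8'), ('B', '<i8'), ('C', '<f8')])</code>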
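pandas also provides DataFrame.to_records, which produces a record array with per-column dtypes in one call. The sketch below is an alternative to the np.rec.fromrecords approach rather than a replacement for it; the two-column DataFrame is assumed for illustration.

```python
import pandas as pd

# Illustrative DataFrame with an integer and a float column (assumed data)
df = pd.DataFrame({'A': [1, 2, 3], 'C': [7.2, 8.1, 9.3]},
                  index=['a', 'b', 'c'])

# By default the index becomes the first field, named 'index'
rec = df.to_records()

# Pass index=False to keep only the data columns
rec_no_idx = df.to_records(index=False)

print(rec.dtype.names)         # field names, index first
print(rec['A'].dtype)          # integer dtype is kept for column A
print(rec['C'].dtype)          # float dtype is kept for column C
print(rec_no_idx.dtype.names)  # data columns only
```

Fields are accessed by name, so each column keeps its own dtype instead of being upcast to a common one.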