How to Efficiently Convert Pandas DataFrames to NumPy Arrays?

Patricia Arquette
2024-12-20 06:15:10

Convert a pandas DataFrame to a NumPy array


Why df.to_numpy() is the recommended method


Using df.to_numpy() is the recommended method because it provides a consistent way to obtain NumPy arrays from pandas objects. It is defined on Index, Series, and DataFrame objects. With the default copy=False, pandas avoids copying where it can, so the result may be a view of the underlying data (for example, when all columns share one dtype), but this is not guaranteed. Do not rely on changes to the array being reflected in the pandas object. If an independent copy of the data is needed, pass copy=True.
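
As a minimal sketch of the points above, the following example calls to_numpy() on a DataFrame, a Series, and an Index, and then uses copy=True to get an array that can be modified without touching the DataFrame. The small DataFrame and the variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({"A": [1.0, 2.0, 3.0], "B": [4.0, 5.0, 6.0]})

# to_numpy() is available on DataFrame, Series, and Index objects.
frame_arr = df.to_numpy()        # 2-D array, one row per DataFrame row
series_arr = df["A"].to_numpy()  # 1-D array of the column values
index_arr = df.index.to_numpy()  # 1-D array of the index labels

print(frame_arr.shape, series_arr.shape, index_arr.shape)

# copy=True guarantees an independent array, so writing to it
# leaves the DataFrame unchanged.
safe = df.to_numpy(copy=True)
safe[0, 0] = 99.0
print(df.loc[0, "A"])  # the DataFrame still holds 1.0
```

Requesting the copy explicitly is the safer choice whenever the array will be modified, since the default result may or may not share memory with the DataFrame.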


Note that df.values is not deprecated and continues to work, but the pandas documentation recommends df.to_numpy() instead. Use to_numpy() for new code and migrate existing code towards the newer API when you can.
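
To illustrate the difference between the two, here is a short sketch on a made-up float DataFrame. It checks that df.values and df.to_numpy() give the same contents, and then uses the dtype and na_value parameters that only to_numpy() accepts. The sample values and variable names are assumptions for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with missing values.
df = pd.DataFrame({"A": [0.1, np.nan], "B": [np.nan, 0.5]})

legacy = df.values      # older attribute, still supported
modern = df.to_numpy()  # recommended method

# For a plain float DataFrame both return the same contents.
same = np.array_equal(legacy, modern, equal_nan=True)
print(same)

# to_numpy() also lets you choose the output dtype and the value
# that replaces missing entries.
filled = df.to_numpy(dtype="float32", na_value=0.0)
print(filled.dtype)
print(filled)
```

These extra parameters are the practical reason to prefer to_numpy(): the conversion options are stated at the call site instead of being applied in a separate step.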


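Keep in mind that to_numpy() returns one array with a single common dtype for all columns. To preserve the per-column dtypes when converting a pandas DataFrame to NumPy, use the DataFrame.to_records() method instead. It returns a NumPy record array and, by default, includes the index as the first field.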


    import pandas as pd
    import numpy as np

    index = [1, 2, 3, 4, 5, 6, 7]
    a = [np.nan, np.nan, np.nan, 0.1, 0.1, 0.1, 0.1]
    b = [0.2, np.nan, 0.2, 0.2, 0.2, np.nan, np.nan]
    c = [np.nan, 0.5, 0.5, np.nan, 0.5, 0.5, np.nan]
    df = pd.DataFrame({'A': a, 'B': b, 'C': c}, index=index)
    df = df.rename_axis('ID')

    # Convert the DataFrame to a NumPy record array with preserved dtypes
    array = df.to_records()

    # Print the repr of the array so that the dtype is shown with the data
    print(repr(array))

The output of the code is as follows (the exact line wrapping may vary with the NumPy version):

    rec.array([(1, nan, 0.2, nan), (2, nan, nan, 0.5), (3, nan, 0.2, 0.5),
               (4, 0.1, 0.2, nan), (5, 0.1, 0.2, 0.5), (6, 0.1, nan, 0.5),
               (7, 0.1, nan, nan)],
              dtype=[('ID', '<i8'), ('A', '<f8'), ('B', '<f8'), ('C', '<f8')])


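As you can see, the record array keeps a separate dtype for each field: the ID index is stored as a 64-bit integer ('<i8') and the columns A, B, and C as 64-bit floats ('<f8'). If the index is not needed in the result, pass index=False to to_records().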
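
The contrast with to_numpy() is easiest to see on a DataFrame whose columns have different dtypes. The sketch below uses a made-up two-column DataFrame (the column names id and score are hypothetical) and compares the dtype of both results.

```python
import pandas as pd

# Hypothetical data: one integer column and one float column.
mixed = pd.DataFrame({"id": [1, 2], "score": [0.5, 0.75]})

# to_numpy() produces one array with a single common dtype,
# so the integer column is upcast to float.
as_array = mixed.to_numpy()
print(as_array.dtype)  # float64

# to_records() keeps one dtype per field; index=False omits the index.
as_records = mixed.to_records(index=False)
print(as_records.dtype.names)  # ('id', 'score')
print(as_records.dtype["id"], as_records.dtype["score"])

# Fields can be read by name as ordinary 1-D arrays.
print(as_records["score"])
```

A record array is the better fit when the column types matter downstream. A plain array from to_numpy() is simpler when all columns are numeric and a common dtype is acceptable.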
