How to Replace NaN Values in a Pandas DataFrame with Column Averages?

Linda Hamilton
2024-10-30 07:01:28

Pandas DataFrame: Replacing NaN Values with Column Averages

Handling missing data in a pandas DataFrame is crucial for accurate analysis. When you encounter incomplete data, you often need to replace NaN values with meaningful estimates. This article demonstrates how to replace each NaN value with the average of its column.

Problem

Consider a DataFrame with a mixture of real numbers and NaN values. The goal is to replace the NaN values with the average values of the columns in which they appear.

Solution

Unlike NumPy arrays, pandas DataFrames can fill NaN values directly with the fillna method:

<code class="python">df.fillna(df.mean())</code>
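To see why this one-liner works, here is a minimal sketch on a small frame. The names `small`, `col_means`, and `filled` are hypothetical and chosen only for illustration. It shows that `mean()` returns a Series indexed by column label and that `fillna` matches that index against the column names.

```python
import numpy as np
import pandas as pd

# Hypothetical two-column frame, used only for illustration.
small = pd.DataFrame({"x": [1.0, np.nan, 3.0], "y": [np.nan, 10.0, 20.0]})

# mean() skips NaN by default (skipna=True) and returns a Series
# indexed by column label: one mean per column.
col_means = small.mean()

# fillna() matches the Series index against the column names, so each
# column's gaps are filled with that column's own mean.
filled = small.fillna(col_means)

print(col_means)
print(filled)
```

Note that `fillna` returns a new DataFrame here, so `small` itself still contains its NaN values.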

Passing the Series returned by df.mean() to fillna fills each NaN with the mean of its own column. By default, mean() ignores NaN values when computing the average. For example:

<code class="python">import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [-0.166919, -0.297953, -0.120211, np.nan, np.nan, -0.788073, -0.916080, -0.887858, 1.948430, 0.019698],
                   'B': [0.979728, -0.912674, -0.540679, -2.027325, np.nan, np.nan, -0.612343, 1.033826, 1.025011, -0.795876],
                   'C': [-0.632955, -1.365463, -0.680481, 1.533582, 0.461821, np.nan, np.nan, np.nan, -2.982224, -0.046431]})

# Per-column means; NaN values are skipped by default
mean = df.mean()
# fillna returns a new DataFrame and leaves df unchanged
print(df.fillna(mean))</code>

Output:

          A         B         C
0 -0.166919  0.979728 -0.632955
1 -0.297953 -0.912674 -1.365463
2 -0.120211 -0.540679 -0.680481
3 -0.151121 -2.027325  1.533582
4 -0.151121 -0.231291  0.461821
5 -0.788073 -0.231291 -0.530307
6 -0.916080 -0.612343 -0.530307
7 -0.887858  1.033826 -0.530307
8  1.948430  1.025011 -2.982224
9  0.019698 -0.795876 -0.046431

The NaN values have been replaced with the average values of their respective columns.
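The example above contains only numeric columns. When a DataFrame also holds text columns, calling `mean()` without arguments may raise a TypeError, depending on the pandas version. The following is a sketch of one way to handle that case, assuming a hypothetical frame named `mixed` with a `name` and a `score` column. It restricts the mean to numeric columns and assigns the result back.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a text column alongside a numeric one.
mixed = pd.DataFrame({
    "name": ["a", "b", None],
    "score": [1.0, np.nan, 5.0],
})

# numeric_only=True restricts the mean to numeric columns, so the
# text column is left out of the resulting Series.
means = mixed.mean(numeric_only=True)

# fillna returns a new DataFrame; assign it back to keep the result.
# Only columns present in `means` are filled, so "name" is untouched.
mixed = mixed.fillna(means)
print(mixed)
```

The missing entry in the text column stays missing, because a column average is not defined for it. It would need a separate fill strategy.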
