Filling Missing Values with Group Mean using Transform
When a DataFrame has missing values, it's common to fill them with a meaningful substitute. One approach is to replace each missing value with the mean of the group it belongs to.
Consider the following DataFrame:
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "value": [1, np.nan, np.nan, 2, 3, 1, 3, np.nan, 3],
    "name": ['A', 'A', 'B', 'B', 'B', 'B', 'C', 'C', 'C'],
})
The goal is to fill all "NaN" values with the mean value within their respective "name" groups.
To achieve this, we can combine the groupby operation with the transform function. The groupby operation splits the DataFrame into groups based on a specific column (in this case, "name"). The transform function then applies a function to each group and returns a result with the same index as the original data, which is why it can be assigned straight back to a column.
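The difference between aggregating and transforming can be seen in a minimal sketch like the one below. It rebuilds the example DataFrame so that it runs on its own; the variable names group_means and row_means are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "value": [1, np.nan, np.nan, 2, 3, 1, 3, np.nan, 3],
    "name": ["A", "A", "B", "B", "B", "B", "C", "C", "C"],
})

# Aggregation: one result per group, indexed by the group key.
group_means = df.groupby("name")["value"].mean()

# Transformation: one result per original row, indexed like df itself.
row_means = df.groupby("name")["value"].transform("mean")

print(group_means.shape)  # one entry per group
print(row_means.shape)    # one entry per row of df
```

Because row_means lines up with df row by row, it can be used directly in column assignments, whereas group_means would first have to be mapped back onto the rows.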
Here's the solution:
# Fill each NaN in "value" with the mean of its "name" group.
df["value"] = df.groupby("name")["value"].transform(lambda x: x.fillna(x.mean()))
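The fillna method replaces missing values with the value it is given (in this case, the mean). Because transform calls the lambda once per group, x holds the values of a single group, so x.mean() is that group's mean. By default, mean() skips NaN values when computing the average.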
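The same fill can also be written without a lambda. This is an alternative sketch, not the method from the original solution: it computes the per-row group means with transform("mean") and passes them to fillna, which accepts a Series and fills each NaN from the matching index label. The names group_mean_per_row and filled are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "value": [1, np.nan, np.nan, 2, 3, 1, 3, np.nan, 3],
    "name": ["A", "A", "B", "B", "B", "B", "C", "C", "C"],
})

# Per-row group means, aligned to df's index.
group_mean_per_row = df.groupby("name")["value"].transform("mean")

# Only the NaN positions are replaced; existing values are left untouched.
filled = df["value"].fillna(group_mean_per_row)

print(filled.tolist())
```

Passing the string "mean" lets pandas use its built-in group mean rather than calling a Python function per group, which tends to matter on larger data.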
The resulting DataFrame will have the missing values filled with the mean value for each group:
   value name
0    1.0    A
1    1.0    A
2    2.0    B
3    2.0    B
4    3.0    B
5    1.0    B
6    3.0    C
7    3.0    C
8    3.0    C
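One case the example data does not exercise is a group whose values are all missing: its mean is itself NaN, so the fill leaves those rows unchanged. The sketch below uses a small hypothetical DataFrame (df2, with an all-NaN group "D") to show this. Falling back to the overall column mean is an assumption made here for illustration, not part of the original solution.

```python
import numpy as np
import pandas as pd

# Hypothetical data: group "D" has no observed values at all.
df2 = pd.DataFrame({
    "value": [1, np.nan, np.nan, np.nan],
    "name": ["A", "A", "D", "D"],
})

# Same group-mean fill as in the solution above.
group_filled = df2.groupby("name")["value"].transform(lambda x: x.fillna(x.mean()))

# Rows in group "D" are still NaN because that group's mean is NaN.
remaining = int(group_filled.isna().sum())
print(remaining)

# Optional fallback (an assumption): use the overall mean of the observed values.
fully_filled = group_filled.fillna(df2["value"].mean())
print(fully_filled.tolist())
```

Checking isna().sum() after the fill is a cheap way to confirm whether any such groups exist before deciding on a fallback.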