Splitting a Column of Dictionaries into Separate Columns with Pandas
When working with dataframes in Pandas, it is common to encounter columns that contain dictionary values. Splitting these columns into individual columns can improve data organization and accessibility.
Consider the following DataFrame:
Station ID  Pollutants
8809        {"a": "46", "b": "3", "c": "12"}
8810        {"a": "36", "b": "5", "c": "8"}
8811        {"b": "2", "c": "7"}
8812        {"c": "11"}
8813        {"a": "82", "c": "15"}
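To split the "Pollutants" column into separate "a", "b", and "c" columns, you can use the json_normalize function. It has been available at the top level as pd.json_normalize since pandas 1.0; earlier versions exposed it as pandas.io.json.json_normalize: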
import pandas as pd
df2 = pd.json_normalize(df['Pollutants'])
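This approach avoids row-wise apply calls such as df['Pollutants'].apply(pd.Series), which can be slow on large DataFrames. Note that df2 holds only the new "a", "b", and "c" columns; joined back to the "Station ID" column, the result looks like this: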
Station ID  a    b    c
8809        46   3    12
8810        36   5    8
8811        NaN  2    7
8812        NaN  NaN  11
8813        82   NaN  15
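For reference, here is a minimal end-to-end sketch of the steps above. It assumes the "Pollutants" column already holds Python dict objects (not JSON strings) and rebuilds the sample data from the first table. The names `df2` and `result` are illustrative, and passing a plain list while reusing `df.index` is one way to keep rows aligned, not the only one.

```python
import pandas as pd

# Sample data rebuilt from the table above; the values stay strings as shown.
df = pd.DataFrame({
    "Station ID": [8809, 8810, 8811, 8812, 8813],
    "Pollutants": [
        {"a": "46", "b": "3", "c": "12"},
        {"a": "36", "b": "5", "c": "8"},
        {"b": "2", "c": "7"},
        {"c": "11"},
        {"a": "82", "c": "15"},
    ],
})

# Expand the dictionaries into columns. Passing a plain list and then
# reusing df's index keeps the rows aligned even if df's index is not 0..n-1.
df2 = pd.json_normalize(df["Pollutants"].tolist())
df2.index = df.index

# Replace the dictionary column with the expanded columns.
result = df.drop(columns="Pollutants").join(df2)
print(result)
```

Because the dictionary values in the sample are strings, the new columns hold strings as well; convert them with pd.to_numeric if you need numbers.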
Note that the resulting DataFrame contains null values (NaN) for missing dictionary keys. To handle these cases, you can use the fillna method to replace missing values with default values or apply custom logic.
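As a rough illustration of handling the missing keys, the sketch below converts the expanded columns to numbers and then fills the gaps. The default values (0 and -1) are arbitrary placeholders chosen for the example, not recommendations; whether a missing reading should become 0 depends on what the data means.

```python
import pandas as pd

# The same dictionaries as in the example table.
records = [
    {"a": "46", "b": "3", "c": "12"},
    {"a": "36", "b": "5", "c": "8"},
    {"b": "2", "c": "7"},
    {"c": "11"},
    {"a": "82", "c": "15"},
]
df2 = pd.json_normalize(records)

# The dictionary values are strings, so convert each column to numbers first.
numeric = df2.apply(pd.to_numeric)

# Option 1: one default for every missing key.
filled_zero = numeric.fillna(0)

# Option 2: a different default per column (placeholder values).
filled_custom = numeric.fillna({"a": 0, "b": -1})

print(filled_zero)
print(filled_custom)
```

Passing a dictionary to fillna only touches the listed columns, so any column left out keeps its NaN values.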