Remove duplicates in DF and convert to JSON obj in python
I have a df similar to the one below
name    series
=============================
a       a1
b       b1
a       a2
a       a1
b       b2
I need to collect the series values into a list for each name, as a dictionary or JSON object like the one below
{ "a": ["a1", "a2"], "b": ["b1", "b2"] }
So far I have tried using groupby but it just groups everything into a single dictionary
test = df.groupby("series")[["name"]].apply(lambda x: x)
The above code gives a df-like output
       Series
Name
A    0     A1
     2     A2
     3     A1
B    1     B1
     4     B2
Any help is greatly appreciated
Thank you
First use drop_duplicates to make sure the values are unique, then groupby.agg as a list:
out = df.drop_duplicates().groupby('name')['series'].agg(list).to_dict()
Or call unique:
out = df.groupby('name')['series'].agg(lambda x: x.unique().tolist()).to_dict()
Output: {'a': ['a1', 'a2'], 'b': ['b1', 'b2']}
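For reference, here is a minimal runnable sketch of both approaches on the sample data from the question. The DataFrame construction is reconstructed from the table above, and the final json.dumps step is an assumption about what "JSON obj" means here (a JSON string); if a plain dict is enough, that step can be skipped.

```python
import json

import pandas as pd

# Sample frame reconstructed from the table in the question
df = pd.DataFrame({
    "name": ["a", "b", "a", "a", "b"],
    "series": ["a1", "b1", "a2", "a1", "b2"],
})

# Approach 1: drop duplicate rows, then collect each group into a list
out_dedup = df.drop_duplicates().groupby("name")["series"].agg(list).to_dict()

# Approach 2: take the unique values inside each group
out_unique = (
    df.groupby("name")["series"]
    .agg(lambda s: s.unique().tolist())
    .to_dict()
)

# Serialize the dict to a JSON string (assumed to be the desired "JSON obj")
json_text = json.dumps(out_dedup)
print(json_text)
```

Both approaches keep values in order of first appearance within each group, so they should give the same dictionary on this data.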
If you have additional columns, make sure to keep only the columns of interest:
out = (df[['name', 'series']].drop_duplicates()
         .groupby('name')['series'].agg(list).to_dict()
      )
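To illustrate why the column selection matters, here is a small sketch with a hypothetical third column named extra (not part of the original question). Because drop_duplicates compares whole rows by default, a column that differs on every row would prevent any row from being treated as a duplicate.

```python
import pandas as pd

# 'extra' is a hypothetical additional column whose value differs on every row
df = pd.DataFrame({
    "name": ["a", "b", "a", "a", "b"],
    "series": ["a1", "b1", "a2", "a1", "b2"],
    "extra": [10, 20, 30, 40, 50],
})

# Whole-row deduplication finds no duplicates because 'extra' differs,
# so the repeated ('a', 'a1') pair survives
naive = df.drop_duplicates().groupby("name")["series"].agg(list).to_dict()

# Restricting to the two columns of interest removes the repeated pair
out = (
    df[["name", "series"]].drop_duplicates()
    .groupby("name")["series"].agg(list).to_dict()
)

print(naive)
print(out)
```

The unique-based variant is not affected by extra columns, since it only ever looks at the series column within each group.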
If you also want each list sorted:
out = (df.groupby('name')['series']
         .agg(lambda x: sorted(x.unique().tolist())).to_dict()
      )
Example:
# input
  Name Series
0    A     Z1
1    B     B1
2    A     A2
3    A     Z1
4    B     B2

# output
{'A': ['A2', 'Z1'], 'B': ['B1', 'B2']}
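The example above can be reproduced with a short script. This is only a sketch: the capitalized column names Name and Series are taken from the example table, whereas the earlier snippets use lowercase names, so adjust them to match your own frame.

```python
import pandas as pd

# Input frame copied from the example table (capitalized column names)
df = pd.DataFrame({
    "Name": ["A", "B", "A", "A", "B"],
    "Series": ["Z1", "B1", "A2", "Z1", "B2"],
})

# Without sorting, values stay in order of first appearance
unsorted_out = (
    df.groupby("Name")["Series"]
    .agg(lambda s: s.unique().tolist())
    .to_dict()
)

# With sorted(), each list is ordered alphabetically
sorted_out = (
    df.groupby("Name")["Series"]
    .agg(lambda s: sorted(s.unique().tolist()))
    .to_dict()
)

print(unsorted_out)
print(sorted_out)
```

Note that sorted compares the values as strings, so a value like 'A10' would sort before 'A2'.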