Joining Grouped Values with a Delimiter in Pandas
When you group rows with groupby and want to combine each group's string values into a single string, a common problem is that the values end up concatenated without a delimiter. To resolve this, pass the delimiter's join method to agg.
Consider the following DataFrame:
col | val
----|------
A   | Cat
A   | Tiger
B   | Ball
B   | Bat
To group these rows based on the col column and concatenate the values in the val column, use the following code:
import pandas as pd

df = pd.DataFrame({'col': ['A', 'A', 'B', 'B'], 'val': ['Cat', 'Tiger', 'Ball', 'Bat']})
grouped = df.groupby('col')['val'].agg('-'.join)
This approach should yield the desired result:
col | val
----|----------
A   | Cat-Tiger
B   | Ball-Bat
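One caveat is that '-'.join only accepts strings, so a group containing numbers or missing values raises a TypeError. The following is a minimal sketch of one way to handle that, assuming missing entries should be skipped and other values cast to text. The helper name join_values and the sample data are hypothetical and not part of the original example.

```python
import pandas as pd

# Hypothetical data: one missing value and one integer mixed in with strings.
df = pd.DataFrame({
    "col": ["A", "A", "B", "B"],
    "val": ["Cat", None, 1, "Bat"],
})

def join_values(values, sep="-"):
    """Join a group's values, skipping missing entries and casting to str."""
    return sep.join(values.dropna().astype(str))

joined = df.groupby("col")["val"].agg(join_values)
print(joined)
```

Whether to drop missing values or render them as a placeholder depends on the data; this sketch simply drops them.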
However, if the values in each group are first concatenated with sum and the join is then run through apply, the delimiter ends up between every character. By that point each group is already a single string such as 'CatTiger', and '-'.join iterates over its characters:
df.groupby('col')['val'].sum().apply(lambda x: '-'.join(x))
col | val
----|----------------
A   | C-a-t-T-i-g-e-r
B   | B-a-l-l-B-a-t
To avoid this issue, use the agg method instead, as demonstrated in the example above.
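The difference can be checked side by side. This is a minimal sketch, assuming the character-by-character output comes from concatenating each group with sum before join is called on the resulting string. The variable names per_group and per_char are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "col": ["A", "A", "B", "B"],
    "val": ["Cat", "Tiger", "Ball", "Bat"],
})

# Join inside each group: '-'.join receives the group's values as items.
per_group = df.groupby("col")["val"].agg("-".join)

# Sum first: each group collapses to one string ('CatTiger', 'BallBat').
# Series.apply then calls join on each string, which iterates its characters.
per_char = df.groupby("col")["val"].sum().apply(lambda s: "-".join(s))

print(per_group.to_dict())
print(per_char.to_dict())
```

The key point is what join iterates over: a sequence of strings in the first case, and the characters of a single string in the second.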
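Additionally, to convert the group index (or MultiIndex, when grouping by several columns) back into regular columns, you can use the reset_index method. Its name argument sets the name of the column that holds the joined values: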
df1 = grouped.reset_index(name='new')
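As a rough end-to-end sketch of this step, the snippet below rebuilds the grouped Series and flattens it. It also shows as_index=False as an alternative that keeps the group key as a column from the start. That alternative is an addition for comparison and not part of the original example, and the variable name df2 is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "col": ["A", "A", "B", "B"],
    "val": ["Cat", "Tiger", "Ball", "Bat"],
})

# Series indexed by 'col', then flattened into a two-column DataFrame.
grouped = df.groupby("col")["val"].agg("-".join)
df1 = grouped.reset_index(name="new")

# Alternative: keep 'col' as a regular column during the groupby itself.
# The value column keeps its original name, 'val'.
df2 = df.groupby("col", as_index=False)["val"].agg("-".join)

print(df1)
print(df2)
```

reset_index(name=...) is convenient when the result column should be renamed; as_index=False saves a step when the original column name is fine.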