Home >Backend Development >Python Tutorial >How to Unnest List Columns in Pandas DataFrames?

How to Unnest List Columns in Pandas DataFrames?

Patricia Arquette
Patricia ArquetteOriginal
2024-12-30 19:21:10297browse

How to Unnest List Columns in Pandas DataFrames?

Unnesting Columns with Pandas

When dealing with DataFrames that contain columns of lists, it can be useful to "unfold" these lists into separate rows.

Method 1: Using DataFrame.explode() (Pandas >= 0.25)

For single-column explosion, use explode() directly:

df = pd.DataFrame({'A': [1, 2], 'B': [[1, 2], [1, 2]]})

df_exploded = df.explode('B')

Method 2: Apply Series

df_exploded = df.set_index('A').B.apply(pd.Series).stack().reset_index(level=0).rename(columns={0:'B'})

Method 3: Repeat DataFrame

df_exploded = pd.DataFrame({'A':df.A.repeat(df.B.str.len()),'B':np.concatenate(df.B.values)})

Method 4: Reindex/Loc

df_exploded = df.reindex(df.index.repeat(df.B.str.len())).assign(B=np.concatenate(df.B.values))

Method 5: ChainMap

from collections import ChainMap
d = dict(ChainMap(*map(dict.fromkeys, df['B'], df['A'])))
df_exploded = pd.DataFrame(list(d.items()),columns=df.columns[::-1])

Method 6: Numpy

newvalues=np.dstack((np.repeat(df.A.values,list(map(len,df.B.values))),np.concatenate(df.B.values)))
df_exploded = pd.DataFrame(data=newvalues[0],columns=df.columns)

Method 7: Iterators

from itertools import cycle,chain
l=df.values.tolist()
l1=[list(zip([x[0]], cycle(x[1])) if len([x[0]]) > len(x[1]) else list(zip(cycle([x[0]]), x[1]))) for x in l]
df_exploded = pd.DataFrame(list(chain.from_iterable(l1)),columns=df.columns)

Generalization to Multiple Columns

To generalize the above methods for multiple columns, use the following function:

def unnesting(df, explode):
    idx = df.index.repeat(df[explode[0]].str.len())
    df1 = pd.concat([
        pd.DataFrame({x: np.concatenate(df[x].values)}) for x in explode], axis=1)
    df1.index = idx
    
    return df1.join(df.drop(explode, 1), how='left')

Column-Wise Unnesting

To unnest horizontally, modify the function:

def unnesting(df, explode, axis):
    if axis==1:
        # Previous implementation
    else :
        df1 = pd.concat([
                         pd.DataFrame(df[x].tolist(), index=df.index).add_prefix(x) for x in explode], axis=1)
        return df1.join(df.drop(explode, 1), how='left')

The above is the detailed content of How to Unnest List Columns in Pandas DataFrames?. For more information, please follow other related articles on the PHP Chinese website!

Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn
Previous article:The one about wordsNext article:The one about words