How to Perform a Cartesian Product of DataFrames in Pandas?

Patricia Arquette
2024-12-24 08:19:13

Cartesian Product in Pandas

In data manipulation tasks, it is often necessary to pair every row of one DataFrame with every row of another. This operation is a Cartesian product (also called a cross join): for inputs with m and n rows, the result has m × n rows, one for each possible combination.
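To make the definition concrete, here is a minimal sketch in plain Python that pairs rows with itertools.product. The row tuples and the names left_rows and right_rows are illustrative only, chosen to mirror the small DataFrames used in this article.

```python
from itertools import product

# Illustrative data: two small "tables", each a list of row tuples.
left_rows = [(1, 3), (2, 4)]
right_rows = [(5,), (6,)]

# Pair every left row with every right row and concatenate the tuples.
combined = [left + right for left, right in product(left_rows, right_rows)]

for row in combined:
    print(row)

# The number of combined rows is the product of the input sizes.
assert len(combined) == len(left_rows) * len(right_rows)
```

Each left row appears once per right row, which is the same ordering pandas produces in the examples that follow.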

In pandas 1.2 and later, the merge function supports a Cartesian product directly through how='cross'. The following code demonstrates its usage:

import pandas as pd

df1 = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})
df2 = pd.DataFrame({'col3': [5, 6]})

# how='cross' pairs every row of df1 with every row of df2 (pandas >= 1.2)
print(df1.merge(df2, how='cross'))

Output:

   col1  col2  col3
0     1     3     5
1     1     3     6
2     2     4     5
3     2     4     6
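As a quick sanity check, the sketch below confirms that the cross merge returns len(df1) * len(df2) rows, and shows what happens when both inputs share a column name. The frames left and right and the suffix strings are hypothetical choices for illustration; suffixes is the standard merge parameter, and without it pandas falls back to its defaults of _x and _y.

```python
import pandas as pd

df1 = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})
df2 = pd.DataFrame({'col3': [5, 6]})

# Requires pandas >= 1.2 for how='cross'.
result = df1.merge(df2, how='cross')
assert len(result) == len(df1) * len(df2)

# Hypothetical frames that share the column name 'value'.
left = pd.DataFrame({'value': [1, 2]})
right = pd.DataFrame({'value': [10, 20, 30]})

# Overlapping column names are disambiguated with the given suffixes.
pairs = left.merge(right, how='cross', suffixes=('_left', '_right'))
print(pairs)
```

Because the row count multiplies, it is worth checking the input sizes before cross joining large DataFrames.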

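In pandas versions before 1.2, how='cross' is not available, but merge can still produce the same result. Give each DataFrame a key column holding the same constant value and join on it: because every key matches every other key, each row of df1 is paired with each row of df2. Selecting the original columns afterward drops the helper key: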

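import pandas as pd

# 'key' holds the same constant in every row of both DataFrames
df1 = pd.DataFrame({'key': [1, 1], 'col1': [1, 2], 'col2': [3, 4]})
df2 = pd.DataFrame({'key': [1, 1], 'col3': [5, 6]})

# Join on the constant key, then keep only the original columns
print(pd.merge(df1, df2, on='key')[['col1', 'col2', 'col3']])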

Output:

   col1  col2  col3
0     1     3     5
1     1     3     6
2     2     4     5
3     2     4     6
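When the DataFrames already exist without a key column, the same trick can be wrapped in a small function. The following is a sketch rather than a definitive implementation: cross_join and the default key name _cross_key are hypothetical names introduced here, and the code assumes neither input already has a column with that name.

```python
import pandas as pd

def cross_join(left, right, key='_cross_key'):
    """Cartesian product via a temporary constant key (hypothetical helper).

    Assumes `key` is not already a column in either DataFrame.
    """
    # assign() returns new DataFrames, so the inputs are left unmodified.
    merged = left.assign(**{key: 1}).merge(right.assign(**{key: 1}), on=key)
    # Remove the temporary key so only the original columns remain.
    return merged.drop(columns=key)

df1 = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})
df2 = pd.DataFrame({'col3': [5, 6]})

result = cross_join(df1, df2)
print(result)
```

Adding the key with assign and dropping it afterward keeps the helper column out of both the inputs and the final result.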
