How can I reshape long data into a wide format with multiple variables using Pandas?

Barbara Streisand
2024-10-30 07:38:27

Reshape Long Data into Wide Format with Pandas

Data stored in long format often has to be reshaped into wide format before it can be analyzed or plotted conveniently. A common difficulty is spreading several value columns (here, product and price) across a single row at the same time.

Consider the following dataframe:

salesman  height  product  price
Knut      6       bat      5
Knut      6       ball     1
Knut      6       wand     3
Steve     5       pen      2

The goal is to reshape this data into a wide format:

salesman  height  product_1  price_1  product_2  price_2  product_3  price_3
Knut      6       bat        5        ball       1        wand       3
Steve     5       pen        2        NA         NA       NA         NA

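melt, stack, and unstack are the usual reshaping tools, but none of them applies directly here: the long table has no column recording whether a row is a salesman's first, second, or third product, so there is nothing to spread into the _1, _2, _3 columns.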
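To see what such a position column could look like, here is a minimal sketch that numbers each salesman's rows with `groupby(...).cumcount()`. The column name `idx` is an assumption made for illustration; it is not part of the original data.

```python
import pandas as pd

df = pd.DataFrame({
    'salesman': ['Knut', 'Knut', 'Knut', 'Steve'],
    'product': ['bat', 'ball', 'wand', 'pen'],
})

# cumcount() numbers the rows inside each group starting at 0;
# adding 1 gives 1-based slots. 'idx' is a hypothetical helper column.
df['idx'] = df.groupby('salesman').cumcount() + 1

print(df['idx'].tolist())  # [1, 2, 3, 1]
```

Each value of `idx` then names the slot (`_1`, `_2`, `_3`) that the row's product and price should occupy in the wide table.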

A solution to this problem can be found using the following code:

<code class="python">import pandas as pd

# Create sample data
raw_data = {
    'salesman': ['Knut', 'Knut', 'Knut', 'Steve'],
    'height': [6, 6, 6, 5],
    'product': ['bat', 'ball', 'wand', 'pen'],
    'price': [5, 1, 3, 2]
}

df = pd.DataFrame(raw_data)

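# Number each salesman's rows 1, 2, 3, ... so every row gets a column slot
df['idx'] = df.groupby(['salesman', 'height']).cumcount() + 1

# Pivot product and price together (a list-valued index needs pandas >= 1.1)
df_wide = df.pivot(index=['salesman', 'height'], columns='idx', values=['product', 'price'])

# Flatten the (name, idx) column pairs into product_1, price_1, ...
df_wide.columns = [f'{name}_{i}' for name, i in df_wide.columns]

# Interleave the columns as product_1, price_1, product_2, price_2, ...
n = df['idx'].max()
ordered = [f'{name}_{i}' for i in range(1, n + 1) for name in ('product', 'price')]

# Reset index to turn salesman and height back into ordinary columns
df_wide = df_wide[ordered].reset_index()

# Handle missing values: show empty slots as the string "NA"
df_wide = df_wide.fillna("NA")</code>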

The resulting dataframe df_wide will be in the desired wide format.
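If the same reshape is needed for other tables, the steps can be wrapped in a small function. The following is only a sketch under stated assumptions: the name `long_to_wide`, its parameters, and the temporary column `_pos` are hypothetical, and a list-valued `index` in `pivot` assumes pandas 1.1 or later.

```python
import pandas as pd

def long_to_wide(df, id_cols, value_cols):
    """Spread value_cols into numbered columns, one row per id_cols combination.

    Hypothetical helper for illustration only.
    """
    # Position of each row within its group: 1, 2, 3, ...
    pos = df.groupby(id_cols).cumcount() + 1
    wide = df.assign(_pos=pos).pivot(index=id_cols, columns='_pos', values=value_cols)
    # Flatten (value name, position) pairs into names such as product_1
    wide.columns = [f'{name}_{i}' for name, i in wide.columns]
    # Interleave the value columns position by position
    ordered = [f'{name}_{i}' for i in range(1, int(pos.max()) + 1) for name in value_cols]
    return wide[ordered].reset_index()

df = pd.DataFrame({
    'salesman': ['Knut', 'Knut', 'Knut', 'Steve'],
    'height': [6, 6, 6, 5],
    'product': ['bat', 'ball', 'wand', 'pen'],
    'price': [5, 1, 3, 2],
})

wide = long_to_wide(df, ['salesman', 'height'], ['product', 'price'])
print(wide.columns.tolist())
# ['salesman', 'height', 'product_1', 'price_1', 'product_2', 'price_2', 'product_3', 'price_3']
```

Missing slots (Steve's second and third product) are left as NaN here, so the caller can decide whether to fill them.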
