How do you transform a wide Pandas DataFrame into a long format with values representing variables and dates?

Susan Sarandon
2024-11-14 11:17:02

Reshaping from Wide Data:

Reshaping a wide dataset into a long one is a common step when preparing data for integration and analysis. Consider the following scenario:

You have a pandas dataframe with daily values for the variables AA, BB, and CC, indexed by date:

+-------+----+----+----+
| date  | AA | BB | CC |
+-------+----+----+----+
| 05/03 | 1  | 2  | 3  |
| 06/03 | 4  | 5  | 6  |
| 07/03 | 7  | 8  | 9  |
| 08/03 | 5  | 7  | 1  |
+-------+----+----+----+

You want to transform this data so that each row holds one variable, one date, and the corresponding value, as shown below:

+------+---------+--------+
| var  | date    | value  |
+------+---------+--------+
| AA   | 05/03   | 1      |
| AA   | 06/03   | 4      |
| AA   | 07/03   | 7      |
| AA   | 08/03   | 5      |
| BB   | 05/03   | 2      |
| BB   | 06/03   | 5      |
| BB   | 07/03   | 8      |
| BB   | 08/03   | 7      |
| CC   | 05/03   | 3      |
| CC   | 06/03   | 6      |
| CC   | 07/03   | 9      |
| CC   | 08/03   | 1      |
+------+---------+--------+

This restructuring is a typical task in data integration: once the data is in long form, you can merge it with another dataframe that is keyed by the same dates and by the original column names (AA, BB, CC).
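To make the merge step concrete, here is a minimal sketch. It writes a few long-format rows out by hand and joins them to a second dataframe. The name `other` and its `weight` column are hypothetical and not part of the original question; only the `var` and `date` keys come from the tables above.

```python
import pandas as pd

# Long-format rows written out by hand (a subset of the target table above).
long_df = pd.DataFrame({
    'var':   ['AA', 'AA', 'BB', 'BB'],
    'date':  ['05/03', '06/03', '05/03', '06/03'],
    'value': [1, 4, 2, 5],
})

# Hypothetical second dataframe; the 'weight' column is invented for illustration.
other = pd.DataFrame({
    'var':    ['AA', 'AA', 'BB', 'BB'],
    'date':   ['05/03', '06/03', '05/03', '06/03'],
    'weight': [0.5, 0.5, 2.0, 2.0],
})

# Join on the (var, date) pair; how='left' keeps every row of long_df.
merged = long_df.merge(other, on=['var', 'date'], how='left')
print(merged)
```

Because both frames carry `var` and `date` as ordinary columns, the join needs no reshaping of its own, which is the main reason to go long before merging.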

Method: Pandas' Melt Function

Fortunately, pandas offers a straightforward method to perform this transformation: pandas.melt or DataFrame.melt. Here's an example:

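import pandas as pd

# Wide frame: one column per variable, indexed by date.
df = pd.DataFrame({
    'date': ['05/03', '06/03', '07/03', '08/03'],
    'AA': [1, 4, 7, 5],
    'BB': [2, 5, 8, 7],
    'CC': [3, 6, 9, 1]
})
df.set_index('date', inplace=True)

# melt needs 'date' as a regular column, so move it out of the index first.
dfm = df.reset_index().melt(id_vars='date')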

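This produces the long format, with one row per date and variable. By default, melt names the new columns variable and value, and the rows are grouped by variable in the original column order: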

     date variable  value
0   05/03       AA      1
1   06/03       AA      4
2   07/03       AA      7
3   08/03       AA      5
4   05/03       BB      2
5   06/03       BB      5
6   07/03       BB      8
7   08/03       BB      7
8   05/03       CC      3
9   06/03       CC      6
10  07/03       CC      9
11  08/03       CC      1
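The output above uses melt's default column name `variable` and puts `date` first, whereas the target table has the columns var, date, value. A minimal sketch of closing that gap uses the `var_name` and `value_name` parameters of `DataFrame.melt` followed by a column reorder. The name `long_df` is chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'date': ['05/03', '06/03', '07/03', '08/03'],
    'AA': [1, 4, 7, 5],
    'BB': [2, 5, 8, 7],
    'CC': [3, 6, 9, 1],
}).set_index('date')

# var_name renames the default 'variable' column; value_name is spelled out for clarity.
long_df = df.reset_index().melt(id_vars='date', var_name='var', value_name='value')

# Reorder the columns to match the target table: var, date, value.
long_df = long_df[['var', 'date', 'value']]
print(long_df)
```

The rows are already grouped by variable and ordered by date within each group, so no extra sort is needed for this input.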
