
How to Pivot a Dataframe Using Pandas?

Patricia Arquette
2024-11-21 02:10:14

Reshaping tabular data is an essential task in data analysis. Pivoting reshapes a dataframe from long format to wide format: the distinct values of one column become the new column headers, and the values of another column fill the cells. It is often useful for building summary tables and exploring data from different perspectives. Let's explore how to perform this operation in Pandas, a powerful data manipulation library.

To pivot a dataframe, use the .pivot method. It takes three arguments:

  1. index: Specifies the column(s) to become the index of the pivoted dataframe.
  2. columns: Indicates the column(s) to become the column headers of the pivoted dataframe.
  3. values: Denotes the column(s) whose values should be used to populate the pivot table.

For example, consider the following dataframe:

Indicator  Country  Year  Value
1          Angola   2005  6
2          Angola   2005  13
3          Angola   2005  10
4          Angola   2005  11
5          Angola   2005  5
1          Angola   2006  3
2          Angola   2006  2
3          Angola   2006  7
4          Angola   2006  3
5          Angola   2006  6
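To follow along, the table above can be rebuilt as a dataframe. This is a minimal sketch that types the sample values in by hand; the variable name df matches the one used in the rest of the article, and how the data is actually loaded in your case (CSV, database, etc.) is left open.

```python
import pandas as pd

# Rebuild the sample table in long format: one row per
# (Indicator, Country, Year) combination.
df = pd.DataFrame({
    "Indicator": [1, 2, 3, 4, 5, 1, 2, 3, 4, 5],
    "Country": ["Angola"] * 10,
    "Year": [2005] * 5 + [2006] * 5,
    "Value": [6, 13, 10, 11, 5, 3, 2, 7, 3, 6],
})
print(df)
```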

To pivot this dataframe so that the values in the Indicator column become the new columns, with Country and Year as the row index, use the following code (passing a list to index requires pandas 1.1.0 or later):

out = df.pivot(index=['Country', 'Year'], columns='Indicator', values='Value')
print(out)

This operation will produce the following pivoted dataframe:

Indicator     1   2   3   4  5
Country Year
Angola  2005  6  13  10  11  5
        2006  3   2   7   3  6
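The pivoted result has a two-level row index (Country, Year) and one column per indicator. As a small sketch of how such a result can be inspected, the snippet below rebuilds the sample data and looks up a single cell; the lookup itself is an illustration and not part of the original walkthrough.

```python
import pandas as pd

# Sample data from the article, typed in by hand.
df = pd.DataFrame({
    "Indicator": [1, 2, 3, 4, 5, 1, 2, 3, 4, 5],
    "Country": ["Angola"] * 10,
    "Year": [2005] * 5 + [2006] * 5,
    "Value": [6, 13, 10, 11, 5, 3, 2, 7, 3, 6],
})

# A list for `index` requires pandas 1.1.0 or later.
out = df.pivot(index=["Country", "Year"], columns="Indicator", values="Value")

# Rows are addressed by a (Country, Year) tuple, columns by indicator label.
value_2005_ind3 = out.loc[("Angola", 2005), 3]
print(value_2005_ind3)        # 10

# The axis names come from the columns used in the pivot.
print(list(out.index.names))  # ['Country', 'Year']
print(out.columns.name)       # Indicator
```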

To flatten the pivoted dataframe into a table with ordinary columns, use .rename_axis to remove the Indicator name from the columns axis and .reset_index to convert Country and Year back to normal columns.

print(out.rename_axis(columns=None).reset_index())

This produces a flat table in wide format, with one row per Country and Year and one column per indicator:

  Country  Year  1   2   3   4  5
0  Angola  2005  6  13  10  11  5
1  Angola  2006  3   2   7   3  6
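The flat table above is still in wide format. If the goal is to get back to the long layout of the input, one option is .melt, which is not covered in the original walkthrough. The following is a sketch under that assumption; the names flat and long_df are chosen here for illustration.

```python
import pandas as pd

# Sample data from the article, typed in by hand.
df = pd.DataFrame({
    "Indicator": [1, 2, 3, 4, 5, 1, 2, 3, 4, 5],
    "Country": ["Angola"] * 10,
    "Year": [2005] * 5 + [2006] * 5,
    "Value": [6, 13, 10, 11, 5, 3, 2, 7, 3, 6],
})
out = df.pivot(index=["Country", "Year"], columns="Indicator", values="Value")
flat = out.rename_axis(columns=None).reset_index()

# Unpivot: every indicator column becomes rows again.
long_df = flat.melt(
    id_vars=["Country", "Year"],
    var_name="Indicator",
    value_name="Value",
)

# The indicator labels come back as generic objects; restore integers
# and sort into the same row order as the input table.
long_df["Indicator"] = long_df["Indicator"].astype(int)
long_df = long_df.sort_values(["Year", "Indicator"]).reset_index(drop=True)
print(long_df)
```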

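If your data contains duplicate combinations of labels (e.g., Country, Year, Indicator), .pivot raises a ValueError because it cannot place two values in the same cell. Use .pivot_table instead, which aggregates the duplicates. It takes the mean by default, and the aggfunc argument selects a different aggregation.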

out = df.pivot_table(
    index=['Country', 'Year'],
    columns='Indicator',
    values='Value')
print(out.rename_axis(columns=None).reset_index())

This will output a pivoted dataframe with the same layout, but each cell holds the mean of the values that share that combination of labels, so the results are floats.
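The sample data has no duplicates, so the difference between the two methods does not show up there. The sketch below uses a small made-up dataframe (dup, with hypothetical values) in which indicator 1 appears twice for the same country and year, and compares .pivot with .pivot_table.

```python
import pandas as pd

# Hypothetical data: indicator 1 is recorded twice for Angola, 2005.
dup = pd.DataFrame({
    "Indicator": [1, 1, 2],
    "Country": ["Angola"] * 3,
    "Year": [2005] * 3,
    "Value": [6, 8, 13],
})

# .pivot cannot decide which of the two values belongs in the cell.
error_message = None
try:
    dup.pivot(index=["Country", "Year"], columns="Indicator", values="Value")
except ValueError as exc:
    error_message = str(exc)
    print("pivot failed:", error_message)

# .pivot_table aggregates duplicates; the default aggregation is the mean.
mean_out = dup.pivot_table(index=["Country", "Year"], columns="Indicator", values="Value")
print(mean_out)  # indicator 1 -> 7.0 (mean of 6 and 8), indicator 2 -> 13.0

# aggfunc picks a different aggregation, here the sum.
sum_out = dup.pivot_table(
    index=["Country", "Year"], columns="Indicator", values="Value", aggfunc="sum"
)
print(sum_out)   # indicator 1 -> 14, indicator 2 -> 13
```

Which aggregation is appropriate depends on what the duplicates mean in your data: a mean suits repeated measurements, while a sum suits partial counts.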

For a more detailed overview, refer to the Pandas user guide on Reshaping and pivot tables.
