Home >Backend Development >Python Tutorial >How do I transform a data table in Pandas with the 'Indicator' column values becoming new columns?
The task involves transposing a data table in CSV format, where the "Indicator" column values become the new columns. The desired result is a flattened format where rows are defined by 'Country' and 'Year', and columns are 'Indicator' values.
To achieve the pivot operation, one can utilize the .pivot method as follows:
out = df.pivot(index=['Country', 'Year'], columns='Indicator', values='Value')
This method rearranges the data such that the 'Country' and 'Year' become the row indices and the 'Indicator' values become the columns. The resulting 'out' variable holds the pivoted data.
To return the data to a flat table format, one can use .rename_axis to eliminate the 'Indicator' label from columns and use .reset_index to restore 'Country' and 'Year' as columns.
print(out.rename_axis(columns=None).reset_index())
This produces a flattened table with 'Country', 'Year', and 'Indicator' values as columns.
In case of duplicate 'Country', 'Year', and 'Indicator' combinations in the original dataset, .pivot_table can be employed. It performs aggregation (mean by default) on duplicate values.
out = df.pivot_table( index=['Country', 'Year'], columns='Indicator', values='Value') print(out.rename_axis(columns=None).reset_index())
This approach results in a flattened table where duplicate values are averaged and the 'Indicator' label is omitted from columns.
For further details on reshaping and pivot tables in Pandas, refer to the following resources:
The above is the detailed content of How do I transform a data table in Pandas with the 'Indicator' column values becoming new columns?. For more information, please follow other related articles on the PHP Chinese website!