How to Create a Pandas DataFrame from a Nested Dictionary with Hierarchical Indexes?

Author: DDD
Published: 2024-12-02 03:30:13

Constructing a Pandas DataFrame from Items in Nested Dictionaries with Hierarchical Indexes

Suppose you want to build a pandas DataFrame from a nested dictionary whose three levels are:

  • Level 1: User ID
  • Level 2: Category
  • Level 3: Assorted Attributes

The desired DataFrame should have a two-level row index (User ID, then Category), with the attributes as columns.

Leveraging Pandas MultiIndex

One approach relies on pandas' MultiIndex: when row labels are tuples, pandas builds a multi-level index from them. To use this method:

  1. Reshape the input dictionary so that its keys are (User ID, Category) tuples, matching the desired MultiIndex values.
  2. Construct the DataFrame using pd.DataFrame.from_dict with orient='index', so that each tuple key becomes a row label and each attribute becomes a column.

import pandas as pd

user_dict = {12: {'Category 1': {'att_1': 1, 'att_2': 'whatever'},
                  'Category 2': {'att_1': 23, 'att_2': 'another'}},
             15: {'Category 1': {'att_1': 10, 'att_2': 'foo'},
                  'Category 2': {'att_1': 30, 'att_2': 'bar'}}}

# Flatten the two outer levels into (user_id, category) tuple keys;
# pandas turns tuple row labels into a MultiIndex.
df = pd.DataFrame.from_dict({(user_id, category): attrs
                             for user_id, categories in user_dict.items()
                             for category, attrs in categories.items()},
                            orient='index')

print(df)



               att_1     att_2
12 Category 1      1  whatever
   Category 2     23   another
15 Category 1     10       foo
   Category 2     30       bar
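
Once the frame has a two-level index, naming the levels tends to make selection easier to read. The following is a minimal sketch of that idea, not part of the original method; the level names user_id and category are illustrative choices that the source data does not define.

```python
import pandas as pd

user_dict = {12: {'Category 1': {'att_1': 1, 'att_2': 'whatever'},
                  'Category 2': {'att_1': 23, 'att_2': 'another'}},
             15: {'Category 1': {'att_1': 10, 'att_2': 'foo'},
                  'Category 2': {'att_1': 30, 'att_2': 'bar'}}}

df = pd.DataFrame.from_dict(
    {(uid, cat): attrs
     for uid, cats in user_dict.items()
     for cat, attrs in cats.items()},
    orient='index')

# Hypothetical level names chosen for this example
df.index.names = ['user_id', 'category']

# All rows for one user (outer level); the result is indexed by category
user_12 = df.loc[12]

# One category across all users, via a cross-section on the inner level
category_1 = df.xs('Category 1', level='category')

# A single cell, addressed by the full (user_id, category) tuple
value = df.loc[(15, 'Category 2'), 'att_2']
```

Here df.loc selects on the outer level, while xs selects on an inner level without spelling out the outer labels.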

Method via Concatenation

Alternatively, you can build the DataFrame incrementally through concatenation:

  1. Extract the User IDs and create an empty list to store component dataframes.
  2. Iterate through the dictionary, creating a dataframe for each user and adding it to the list.
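  3. Concatenate the component dataframes using pd.concat, passing the User IDs as keys so that they become the outer index level.

import pandas as pd

# Assumes user_dict from the previous example is already defined.
user_ids = []
frames = []

# dict.items() replaces iteritems(), which exists only in Python 2
for user_id, d in user_dict.items():
    user_ids.append(user_id)
    # One frame per user: categories as rows, attributes as columns
    frames.append(pd.DataFrame.from_dict(d, orient='index'))

# keys=user_ids adds the User ID as the outer index level
df = pd.concat(frames, keys=user_ids)

print(df)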


               att_1     att_2
12 Category 1      1  whatever
   Category 2     23   another
15 Category 1     10       foo
   Category 2     30       bar
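
The concatenation approach can also be written without the explicit loop, because pd.concat accepts a mapping and uses its keys as the outer index level. The sketch below shows that variant under the same sample data; the level names passed through names are illustrative assumptions, not something the original example specifies.

```python
import pandas as pd

user_dict = {12: {'Category 1': {'att_1': 1, 'att_2': 'whatever'},
                  'Category 2': {'att_1': 23, 'att_2': 'another'}},
             15: {'Category 1': {'att_1': 10, 'att_2': 'foo'},
                  'Category 2': {'att_1': 30, 'att_2': 'bar'}}}

# Build one frame per user, keyed by user ID; concat uses the dict keys
# as the outer level of the resulting MultiIndex.
df = pd.concat(
    {user_id: pd.DataFrame.from_dict(categories, orient='index')
     for user_id, categories in user_dict.items()},
    names=['user_id', 'category'])  # hypothetical level names
```

This should yield the same four rows as the loop version, with the added benefit that both index levels are named.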
