Reading JSON with Nested Objects as a Pandas DataFrame
JSON data that contains nested objects or arrays does not map directly onto a flat table. Pandas provides json_normalize for this purpose: it flattens nested records into DataFrame columns.
Expanding the Array into Columns
To expand the locations array into separate columns, pass it to json_normalize as the record path and list date, number, and name as metadata columns to repeat on every row:
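<code class="python">import json
import pandas as pd

# Load the raw JSON into a Python dict.
with open('myJson.json') as data_file:
    data = json.load(data_file)

# One row per entry in 'locations'; date, number and name are repeated on each row.
df = pd.json_normalize(data, 'locations', ['date', 'number', 'name'],
                       record_prefix='locations_')
print(df)</code>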
This will create a dataframe with expanded columns:
  locations_arrTime locations_arrTimeDiffMin locations_depTime  \
0                                                        06:32
1             06:37                        1             06:40
2             08:24                        1

  locations_depTimeDiffMin           locations_name locations_platform  \
0                        0  Spital am Pyhrn Bahnhof                  2
1                        0  Windischgarsten Bahnhof                  2
2                                    Linz/Donau Hbf               1A-B

  locations_stationIdx locations_track number    name        date
0                    0          R 3932         R 3932  01.10.2016
1                    1                         R 3932  01.10.2016
2                   22                         R 3932  01.10.2016
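As a self-contained illustration of the same call, the sketch below builds a small record in memory instead of reading myJson.json. The record's shape and values are assumptions modeled on the output above, and expand_locations is a hypothetical helper name. The record_prefix matters here: both the top-level object and each location have a name key, and json_normalize raises a ValueError on such a conflict unless a prefix distinguishes them.

```python
import pandas as pd

# Hypothetical sample record shaped like the file discussed above
# (only a few of the location fields are included).
record = {
    "number": "",
    "date": "01.10.2016",
    "name": "R 3932",
    "locations": [
        {"name": "Spital am Pyhrn Bahnhof", "depTime": "06:32", "platform": "2"},
        {"name": "Windischgarsten Bahnhof", "depTime": "06:40", "platform": "2"},
        {"name": "Linz/Donau Hbf", "depTime": "", "platform": "1A-B"},
    ],
}


def expand_locations(rec):
    """Return one row per entry in rec['locations'], repeating the top-level fields."""
    return pd.json_normalize(
        rec,
        record_path="locations",
        meta=["date", "number", "name"],
        record_prefix="locations_",  # avoids the clash between the two 'name' keys
    )


df = expand_locations(record)
print(df)
```

Each location becomes one row, and the three metadata columns carry the same value on every row.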
Handling Multiple JSON Objects
For JSON files containing multiple objects, the approach depends on the desired data structure.
Keep Individual Columns
To keep one row per record with the columns date, number, name, and locations, reduce each location to its name and join the names within each group:
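<code class="python">df = pd.read_json('myJson.json')
# Replace each nested location object with just its 'name' field.
df.locations = pd.DataFrame(df.locations.values.tolist())['name']
# Collapse to one row per (date, name, number), joining the location names with commas.
df = df.groupby(['date', 'name', 'number'])['locations'].apply(','.join).reset_index()
print(df)</code>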
This will group the data and concatenate the locations:
        date    name number                                          locations
0 2016-01-10  R 3932         Spital am Pyhrn Bahnhof,Windischgarsten Bahnho...
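The same grouping can also be applied when the input holds several top-level objects, since json_normalize accepts a list of records. The following is a minimal sketch under stated assumptions: the two records, their number values, the second train name, and the helper name summarize_stops are all invented for illustration.

```python
import pandas as pd

# Hypothetical input: two top-level objects, each with a nested list of stops.
records = [
    {"date": "01.10.2016", "number": "1", "name": "R 3932",
     "locations": [{"name": "Spital am Pyhrn Bahnhof"}, {"name": "Linz/Donau Hbf"}]},
    {"date": "01.10.2016", "number": "2", "name": "R 3934",
     "locations": [{"name": "Windischgarsten Bahnhof"}]},
]


def summarize_stops(recs):
    """Return one row per record with its stop names joined by commas."""
    # First flatten to one row per stop, keeping the top-level fields.
    flat = pd.json_normalize(
        recs, "locations", ["date", "number", "name"], record_prefix="locations_"
    )
    # Then collapse back to one row per record.
    return (
        flat.groupby(["date", "name", "number"])["locations_name"]
        .agg(",".join)
        .reset_index()
        .rename(columns={"locations_name": "locations"})
    )


summary = summarize_stops(records)
print(summary)
```

Flattening first and grouping afterwards keeps the stop order within each record, because groupby preserves the original row order inside each group.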
Flatten the Data Structure
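If you prefer a fully flattened table with one row per location, use json_normalize with a dotted record prefix. Note that pd.read_json on its own does not flatten the nested objects; it leaves each location as a dict in a single locations column:
<code class="python"># Reuses `data`, the dict loaded with json.load in the first example.
df = pd.json_normalize(data, 'locations', ['number', 'date', 'name'],
                       record_prefix='locations.')
# Default parsing is month-first, so '01.10.2016' becomes 2016-01-10;
# pass dayfirst=True if the source dates are day-first.
df['date'] = pd.to_datetime(df['date'])
print(df)</code>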
This will output the data in a single table:
   number       date    name  ...  locations.arrTimeDiffMin  locations.depTimeDiffMin  locations.platform
0  R 3932 2016-01-10  R 3932  ...                         0                         0                   2
1  R 3932 2016-01-10  R 3932  ...                         1                         0                   2
2  R 3932 2016-01-10  R 3932  ...                         1                         -                1A-B
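The date column deserves a closer look: the raw value in the first output is 01.10.2016, while the parsed outputs show 2016-01-10, which is what month-first parsing produces. Whether the source data is actually day-first (1 October 2016) is an assumption the data alone cannot confirm. The sketch below only shows how the two readings differ and how to state the format explicitly.

```python
import pandas as pd

# Hypothetical raw date strings in the same format as the 'date' field above.
raw = pd.Series(["01.10.2016", "01.10.2016"])

# Default behaviour: ambiguous dates are read month-first (10 January 2016).
month_first = pd.to_datetime(raw)

# Day-first reading (1 October 2016), either as a hint or as an explicit format.
day_first = pd.to_datetime(raw, dayfirst=True)
explicit = pd.to_datetime(raw, format="%d.%m.%Y")

print(month_first.iloc[0].date(), day_first.iloc[0].date(), explicit.iloc[0].date())
```

An explicit format string is the stricter option, since it fails on values that do not match instead of guessing.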