How to Read Nested JSON into a Pandas DataFrame and Manipulate Data Structures?

Barbara Streisand
2024-10-24 12:10:02

Reading Nested JSON as a Pandas DataFrame

To read a JSON file with nested objects into a pandas DataFrame, use the json_normalize function from pandas. It flattens nested data structures into a flat table, which is easier to manipulate and analyze.
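As a minimal sketch of that flattening, the snippet below normalizes two small records. The records and their field names (id, name, address) are hypothetical and are not taken from the file discussed in this article.

```python
import pandas as pd

# Hypothetical records with a nested "address" object (illustration only).
records = [
    {"id": 1, "name": "Ann", "address": {"city": "Linz", "zip": "4020"}},
    {"id": 2, "name": "Bob", "address": {"city": "Graz", "zip": "8010"}},
]

# Keys of nested objects are flattened into dotted column names,
# for example "address.city" and "address.zip".
flat = pd.json_normalize(records)
print(flat.columns.tolist())
```

The separator used in the flattened names can be changed with the sep parameter of json_normalize.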

Expanding Arrays into Columns

Your sample JSON contains an array of locations. Instead of keeping this array as a single column of JSON, you can expand it so that each entry becomes its own row and each key of an entry becomes a column. json_normalize does this through its record_path parameter, which names the nested array to expand. The meta parameter lists the top-level fields (here date, number, and name) to repeat on every expanded row.

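<code class="python">import json

import pandas as pd

with open('myJson.json') as data_file:
    data = json.load(data_file)

# Expand each entry of the 'locations' array into its own row and repeat
# the top-level 'date', 'number', and 'name' fields on every row.
# Passing meta both positionally and by keyword raises a TypeError,
# so it is given once here.
df = pd.json_normalize(data, record_path='locations',
                       meta=['date', 'number', 'name'],
                       record_prefix='locations_')</code>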

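The resulting DataFrame has one row per entry in the locations array. Each key of an entry (for example depTime or arrTime, if the entries carry those keys) becomes a column prefixed with locations_, and the top-level fields listed in meta are repeated on every row.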
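To make the resulting shape concrete, here is a small in-memory sketch. The sample values are hypothetical, and the assumption that each location carries name, depTime, and arrTime keys is made only for illustration.

```python
import pandas as pd

# Hypothetical data: top-level fields plus a "locations" array of objects.
data = {
    "date": "2016-10-01",
    "number": "42",
    "name": "Train 1",
    "locations": [
        {"name": "Station A", "depTime": "06:32", "arrTime": ""},
        {"name": "Station B", "depTime": "06:40", "arrTime": "06:37"},
    ],
}

df = pd.json_normalize(
    data,
    record_path="locations",          # the nested array to expand into rows
    meta=["date", "number", "name"],  # top-level fields repeated on each row
    record_prefix="locations_",       # avoids a clash between the two "name" keys
)
print(df)
```

Without record_prefix, the name key inside each location would collide with the top-level name field, and json_normalize raises a ValueError asking for a distinguishing prefix.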

Joining Locations Column

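You mentioned that you want to join the locations column. If the DataFrame still has a locations column whose cells are lists of strings, you can join each list with the following code:

<code class="python">df['locations'] = df.locations.apply(','.join)</code>

This concatenates each row's locations into a single comma-separated string. It raises a TypeError if the list items are not strings, for example when each location is still a dictionary.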
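When each cell holds a list of dictionaries rather than strings, one option is to extract a single field before joining. The sketch below assumes a hypothetical name key in every location; the sample frame is invented for illustration.

```python
import pandas as pd

# Hypothetical frame whose "locations" cells are lists of dicts.
df = pd.DataFrame({
    "name": ["Train 1", "Train 2"],
    "locations": [
        [{"name": "Station A"}, {"name": "Station B"}],
        [{"name": "Station C"}],
    ],
})

# Pull the "name" value out of each dict, then join the resulting strings.
df["locations"] = df["locations"].apply(
    lambda locs: ",".join(loc["name"] for loc in locs)
)
print(df)
```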

Handling Multiple JSON Objects

If your JSON file contains multiple JSON objects (one per line), you can use the following code:

<code class="python">import json

import pandas as pd

# Read the file line by line; each non-empty line is one JSON object
with open('myJson.json') as f:
    data = [json.loads(line) for line in f if line.strip()]

# Convert the list of dictionaries to a DataFrame
df = pd.DataFrame(data)</code>

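With pd.DataFrame, nested values remain in their columns as Python lists and dictionaries. To flatten them, pass the same list of dictionaries to pd.json_normalize, as described above, instead of pd.DataFrame.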
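The following sketch combines both steps on line-delimited input. It uses an in-memory string in place of a file, and the field names are hypothetical. It also shows pd.read_json with lines=True, which parses this format directly but leaves nested values unflattened.

```python
import io
import json

import pandas as pd

# Hypothetical line-delimited JSON: one object per line.
text = (
    '{"name": "Train 1", "locations": [{"name": "Station A"}, {"name": "Station B"}]}\n'
    '{"name": "Train 2", "locations": [{"name": "Station C"}]}\n'
)

# Parse each non-empty line, then normalize the whole list in one call.
records = [json.loads(line) for line in io.StringIO(text) if line.strip()]
df = pd.json_normalize(
    records,
    record_path="locations",
    meta=["name"],
    record_prefix="locations_",
)

# read_json can parse the same input directly; the "locations" column
# then holds plain Python lists of dicts.
raw = pd.read_json(io.StringIO(text), lines=True)
print(df)
```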

With json_normalize, you can read nested JSON, flatten it, and then work with the result as an ordinary pandas DataFrame.
