How to Import and Process Nested JSON Data into Pandas DataFrames?

Linda Hamilton
2024-10-24 11:40:02

Reading Nested JSON Files as Pandas DataFrames

JSON data that contains nested objects or arrays usually has to be converted into a tabular form before it can be analyzed or manipulated. Pandas provides json_normalize and read_json for this purpose.

Scenario:

Consider a JSON file with the following structure:

<code class="json">{
    "number": "",
    "date": "01.10.2016",
    "name": "R 3932",
    "locations": [
        { ... },
        { ... },
        { ... }
    ]
}</code>

Using json_normalize:

The json_normalize function allows you to flatten nested JSON into a DataFrame. For the given JSON, you can do the following:

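<code class="python">import json

import pandas as pd

# Load the whole file into a Python dictionary.
with open('myJson.json') as data_file:
    data = json.load(data_file)

# One row per entry in "locations"; date, number and name are repeated on each row.
df = pd.json_normalize(data, 'locations', ['date', 'number', 'name'],
                       record_prefix='locations_')
print(df)</code>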

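This creates a DataFrame with one row per entry in locations. Each field of a location becomes a column prefixed with locations_, and the top-level date, number, and name values are repeated on every row.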
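As a minimal, self-contained sketch of the json_normalize call, the snippet below uses an in-memory dictionary in place of myJson.json. The contents of each location are elided in the scenario, so the name and city fields here are hypothetical placeholders chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample: the "name" and "city" fields inside each location
# are assumptions, since the scenario does not show the location fields.
data = {
    "number": "",
    "date": "01.10.2016",
    "name": "R 3932",
    "locations": [
        {"name": "Stop A", "city": "X"},
        {"name": "Stop B", "city": "Y"},
        {"name": "Stop C", "city": "Z"},
    ],
}

df = pd.json_normalize(
    data,
    record_path="locations",          # the nested list to expand into rows
    meta=["date", "number", "name"],  # top-level keys repeated on each row
    record_prefix="locations_",       # prefix for columns taken from each location
)

print(df.columns.tolist())
# ['locations_name', 'locations_city', 'date', 'number', 'name']
```

In this sample each location has its own name field, so the record_prefix is what keeps it from colliding with the top-level name column.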

Extending to Keep Nested Data:

If you prefer to keep the nested objects intact, call read_json without extra arguments (read_json has no parsing parameter). Because the top-level values are scalars and locations is a list, the result has one row per list element, with each cell of the locations column holding a dictionary.

<code class="python">df = pd.read_json("myJson.json")</code>
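The following sketch shows what a plain read_json call returns for data of this shape. It reads from an in-memory StringIO buffer instead of a file, and the name field inside each location is a hypothetical placeholder. Passing convert_dates=False is an optional choice made here so that the date column stays as the original text; by default pandas may try to parse a column named date.

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-in for myJson.json; the location fields are assumed.
raw = StringIO("""
{
    "number": "",
    "date": "01.10.2016",
    "name": "R 3932",
    "locations": [
        {"name": "Stop A"},
        {"name": "Stop B"},
        {"name": "Stop C"}
    ]
}
""")

# convert_dates=False leaves the "date" column unparsed.
nested = pd.read_json(raw, convert_dates=False)

print(len(nested))                         # one row per location
print(type(nested["locations"].iloc[0]))   # each cell is still a dictionary
```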

Alternatively, if you only need the nested records, build a DataFrame directly from the locations list in the data dictionary loaded with json.load earlier (read_json has no constructor parameter):

<code class="python">df = pd.DataFrame(data['locations'])</code>
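A minimal sketch of turning only the nested list into a DataFrame, assuming the JSON has already been loaded into a dictionary. The name and city fields are hypothetical, since the scenario does not show what a location contains.

```python
import pandas as pd

# Hypothetical dictionary shaped like the result of json.load on the file.
data = {
    "number": "",
    "date": "01.10.2016",
    "name": "R 3932",
    "locations": [
        {"name": "Stop A", "city": "X"},
        {"name": "Stop B", "city": "Y"},
    ],
}

# A list of dictionaries maps directly to rows; the keys become columns.
locations_df = pd.DataFrame(data["locations"])
print(locations_df)
```

Unlike json_normalize with meta columns, this keeps none of the top-level fields such as date or number.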

Concatenating Nested Values:

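To join the values in the locations column into a single comma-separated string, starting from the DataFrame returned by read_json, group by the top-level fields and apply ','.join. This requires each locations cell to be a string, so if the cells hold dictionaries, first replace each one with the string field you want to join.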

<code class="python">df = df.groupby(['date', 'name', 'number'])['locations'].apply(','.join).reset_index()</code>
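The sketch below walks through the join end to end on a small hand-built frame. It assumes each location dictionary has a name key, which is a hypothetical field, and extracts that string before grouping, because ','.join only works on strings.

```python
import pandas as pd

# Hypothetical frame shaped like the read_json result: one row per location,
# with each "locations" cell holding a dictionary. The "name" key is assumed.
df = pd.DataFrame({
    "number": "",
    "date": "01.10.2016",
    "name": "R 3932",
    "locations": [{"name": "Stop A"}, {"name": "Stop B"}, {"name": "Stop C"}],
})

# Replace each dictionary with the string field to be joined.
df["locations"] = df["locations"].map(lambda loc: loc["name"])

# Collapse the rows of each record into one comma-separated string.
joined = (
    df.groupby(["date", "name", "number"])["locations"]
    .apply(",".join)
    .reset_index()
)

print(joined["locations"].iloc[0])
# Stop A,Stop B,Stop C
```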
