
How to Efficiently Decode JSON Streams without Loading the Entire Payload into Memory?


Decoding JSON Streams without Reading Entire Payload

Suppose we need to decode JSON data arriving over a streaming HTTP response without loading the entire payload into memory. The objective is to process the individual items of an array (each one a large object) and dispatch them to a message queue as soon as they are received.

Event-Driven JSON Parsing

To achieve this, we employ the json.Decoder and its Decode() and Token() methods. Decode() can be used to unmarshal a single value, while Token() allows us to parse only the next token in the JSON stream, enabling us to process the data incrementally.
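
For reference, here is a minimal, self-contained sketch (separate from the example below) that prints every token json.Decoder produces for a small document; each call to Token() returns a json.Delim ('{', '}', '[', ']'), a string, a float64, a bool, or nil:

package main

import (
    "encoding/json"
    "fmt"
    "io"
    "log"
    "strings"
)

func main() {
    dec := json.NewDecoder(strings.NewReader(`{"n": 1, "items": [true, "x"]}`))
    for {
        t, err := dec.Token()
        if err == io.EOF {
            break // end of the stream
        }
        if err != nil {
            log.Fatal(err)
        }
        // Print the dynamic type and value of each token.
        fmt.Printf("%T  %v\n", t, t)
    }
}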

On-the-Fly Token Processing

We establish a "state machine" to keep track of our position within the JSON structure. By analyzing each token, we navigate through the object hierarchy, identifying the "items" array and its large object elements.

Code Implementation

package main

import (
    "encoding/json"
    "fmt"
    "log"
    "strings"
)

// Helper error handler
func he(err error) {
    if err != nil {
        log.Fatal(err)
    }
}

// Large object struct
type LargeObject struct {
    Id   string `json:"id"`
    Data string `json:"data"`
}

func main() {
    // JSON input for demonstration
    in := `{"somefield": "value", "otherfield": "othervalue", "items": [ {"id": "1", "data": "data1"}, {"id": "2", "data": "data2"}, {"id": "3", "data": "data3"}, {"id": "4", "data": "data4"}]}`

    dec := json.NewDecoder(strings.NewReader(in))

    // Expect an object
    t, err := dec.Token()
    he(err)
    if delim, ok := t.(json.Delim); !ok || delim != '{' {
        log.Fatal("Expected object")
    }

    // Read props
    for dec.More() {
        t, err = dec.Token()
        he(err)
        prop := t.(string)
        if prop != "items" {
            var v interface{}
            he(dec.Decode(&v))
            fmt.Printf("Property '%s' = %v\n", prop, v)
            continue
        }

        // It's the "items". Expect an array
        t, err = dec.Token()
        he(err)
        if delim, ok := t.(json.Delim); !ok || delim != '[' {
            log.Fatal("Expected array")
        }
        // Read items (large objects)
        for dec.More() {
            // Read next item (large object)
            lo := LargeObject{}
            he(dec.Decode(&lo))
            fmt.Printf("Item: %+v\n", lo)
        }
        // Array closing delim
        t, err = dec.Token()
        he(err)
        if delim, ok := t.(json.Delim); !ok || delim != ']' {
            log.Fatal("Expected array closing")
        }
    }

    // Object closing delim
    t, err = dec.Token()
    he(err)
    if delim, ok := t.(json.Delim); !ok || delim != '}' {
        log.Fatal("Expected object closing")
    }
}

Sample Output

Property 'somefield' = value
Property 'otherfield' = othervalue
Item: {Id:1 Data:data1}
Item: {Id:2 Data:data2}
Item: {Id:3 Data:data3}
Item: {Id:4 Data:data4}

By using this event-driven parsing approach, we can process large JSON responses incrementally, keeping memory consumption low no matter how many items the array contains.
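
For the HTTP-streaming scenario described at the start, the same token loop can read directly from the response body and hand each decoded item to a message queue. The sketch below follows the structure of the example above; the URL is a placeholder and publish() is a hypothetical stand-in for a real queue client:

package main

import (
    "encoding/json"
    "fmt"
    "log"
    "net/http"
)

// Large object struct, as in the example above
type LargeObject struct {
    Id   string `json:"id"`
    Data string `json:"data"`
}

// publish is a hypothetical stand-in for dispatching an item to a message queue.
func publish(lo LargeObject) {
    fmt.Printf("queued: %+v\n", lo)
}

func main() {
    // Placeholder URL; the endpoint is assumed to return the same
    // {"somefield": ..., "items": [ ... ]} structure as the demo input above.
    resp, err := http.Get("https://example.com/large-response")
    if err != nil {
        log.Fatal(err)
    }
    defer resp.Body.Close()

    // Decode straight from the response body: only one item is held
    // in memory at a time.
    dec := json.NewDecoder(resp.Body)

    // Opening '{' of the top-level object
    if _, err := dec.Token(); err != nil {
        log.Fatal(err)
    }
    for dec.More() {
        t, err := dec.Token() // property name
        if err != nil {
            log.Fatal(err)
        }
        prop, ok := t.(string)
        if !ok {
            log.Fatal("expected property name")
        }
        if prop != "items" {
            // Skip the value of any other property.
            var v interface{}
            if err := dec.Decode(&v); err != nil {
                log.Fatal(err)
            }
            continue
        }
        // Opening '[' of the items array
        if _, err := dec.Token(); err != nil {
            log.Fatal(err)
        }
        for dec.More() {
            var lo LargeObject
            if err := dec.Decode(&lo); err != nil {
                log.Fatal(err)
            }
            publish(lo) // dispatch as soon as the item is decoded
        }
        // Closing ']' of the items array
        if _, err := dec.Token(); err != nil {
            log.Fatal(err)
        }
    }
}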
