Convert parquet file to Golang structure with nested elements
php Editor Shinichi introduces how to read a parquet file that contains nested elements (lists and structs) into Go structs. Parquet is a columnar storage format, and Go libraries such as xitongsys/parquet-go and segmentio/parquet-go map its columns onto struct fields, including nested ones. The question and answer below show a case where the nested list values were not read, and a library that reads them correctly.
I'm trying to read a parquet file with nested arrays/structs in Go using the xitongsys/parquet-go library. The list data is not read: no values come back for the nested fields. Below are my struct definitions in Go:
type Play struct {
	SID            string   `parquet:"name=si, type=BYTE_ARRAY, convertedtype=UTF8, encoding=PLAIN_DICTIONARY, repetitiontype=OPTIONAL" json:"si,omitempty"`
	TimeStamp      int      `parquet:"name=ts, type=INT64, repetitiontype=OPTIONAL" json:"ts,omitempty"`
	SingleID       int      `parquet:"name=sg, type=INT64, repetitiontype=OPTIONAL" json:"sg,omitempty"`
	PID            int      `parquet:"name=playid, type=INT64, repetitiontype=OPTIONAL" json:"playid,omitempty"`
	StartTimeStamp string   `parquet:"name=startts, type=BYTE_ARRAY,repetitiontype=OPTIONAL"`
	Price          []Price1 `parquet:"name=price, type=LIST, repetitiontype=REQUIRED" json:"price,omitempty"`
}

type Price1 struct {
	CurrID int    `parquet:"name=currId, type=INT64, repetitiontype=REQUIRED" json:"currId,omitempty"`
	LPTag  string `parquet:"name=lptag, type=BYTE_ARRAY,convertedtype=UTF8, repetitiontype=REQUIRED" json:"lptag,omitempty"`
	LPrice Money  `parquet:"name=lpmoney, type=STRUCT" json:"lpmoney,omitempty"`
}

type Money struct {
	AdmCurrCode  string `parquet:"name=admCC, type=BYTE_ARRAY, repetitiontype=OPTIONAL" json:"admCC,omitempty"`
	AdmCurrValue string `parquet:"name=admCV, type=BYTE_ARRAY" json:"admCV,omitempty"`
}
The currId and lptag fields are empty after reading, even though the parquet file contains valid values for them.
I found that the github.com/segmentio/parquet-go package can read the file correctly. Do you need to stick with the github.com/xitongsys/parquet-go package?
package main

import (
	"fmt"

	"github.com/segmentio/parquet-go"
)

type Play struct {
	SID            string  `parquet:"si"`
	TimeStamp      int     `parquet:"ts"`
	SingleID       int     `parquet:"sg"`
	PID            int     `parquet:"playid"`
	StartTimeStamp string  `parquet:"startts"`
	Price          []Price `parquet:"price,list"`
}

type Price struct {
	CurrID int    `parquet:"currId"`
	LPTag  string `parquet:"lptag"`
	LPrice Money  `parquet:"lpmoney"`
}

type Money struct {
	AdmCurrCode  string `parquet:"admCC"`
	AdmCurrValue string `parquet:"admCV"`
}

func main() {
	rows, err := parquet.ReadFile[Play]("s3.parquet")
	if err != nil {
		panic(err)
	}

	for _, c := range rows {
		fmt.Printf("%+v\n", c)
	}
}