Building a schema with the data types from datatype.go in the Go Apache Arrow library

WBOY
2024-02-06 08:36:07

Question content

I am learning Apache Arrow and would like to learn more about how to create schemas and Arrow records. I've referenced some material for this, but so far all of it just uses primitive types to build a schema like this:

schema := arrow.NewSchema(
    []arrow.Field{
        {Name: "f1-i32", Type: arrow.PrimitiveTypes.Int32},
        {Name: "f2-f64", Type: arrow.PrimitiveTypes.Float64},
    },
    nil,
)

Some data types that I want to use do not exist in arrow.PrimitiveTypes. For example, I want to use bool or decimal128. I looked through the Go Arrow library and found the file datatype.go, which contains all the data types I want to use. But the type used there is not the DataType type required when building the schema.

So, I have the following three questions:

  1. If possible, how can I build my schema using these datatypes from datatype.go?
  2. If I want to use a decimal type, how do I specify the precision and number of decimal places?
  3. Are there examples of using extension types?

Correct Answer


The named data type constants defined in datatype.go are already used to build new types as structs. Two of them are type Decimal128Type struct and type BooleanType struct. If you check the source code of these structs' ID methods, they return the constants defined in datatype.go whose names are similar to the names of the structs. These structs already implement the DataType interface, which means you can assign them to arrow.Field.Type, because the type of that field is DataType.

Concretely:

The BOOL constant defined in datatype.go is used in datatype_fixedwidth.go as the return value of the ID method of type BooleanType struct:

func (t *BooleanType) ID() Type { return BOOL }

The same applies to type Decimal128Type struct:

func (*Decimal128Type) ID() Type { return DECIMAL128 }

The methods on one of these structs show that it implements the DataType interface:

func (*Decimal128Type) BitWidth() int
func (t *Decimal128Type) Fingerprint() string
func (*Decimal128Type) ID() Type
func (*Decimal128Type) Name() string
func (t *Decimal128Type) String() string

These methods belong to type Decimal128Type struct. And this is the definition of the DataType interface:

type DataType interface {
    ID() Type
    // Name is name of the data type.
    Name() string
    Fingerprint() string
}

type BooleanType struct also implements it.

So you can use them for the Type field of type Field struct:

type Field struct {
    Name     string   // Field name
    Type     DataType // The field's data type
    Nullable bool     // Fields can be nullable
    Metadata Metadata // The field's metadata, if any
}
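
The mechanism described so far can be tried out without the library. The following is a minimal, self-contained sketch that uses stand-in types mirroring the names quoted above; it is not the arrow package's code. The Fingerprint strings, the cut-down Field struct, and the describe helper are assumptions made for illustration only.

```go
package main

import "fmt"

// Type stands in for arrow.Type, the enum of type IDs declared in datatype.go.
type Type int

const (
	BOOL Type = iota
	DECIMAL128
)

// DataType mirrors the interface quoted in the answer.
type DataType interface {
	ID() Type
	Name() string
	Fingerprint() string
}

// BooleanType is a stand-in for the arrow struct of the same name.
type BooleanType struct{}

func (*BooleanType) ID() Type            { return BOOL }
func (*BooleanType) Name() string        { return "bool" }
func (*BooleanType) Fingerprint() string { return "bool" } // placeholder value

// Decimal128Type is a stand-in that carries its two parameters.
type Decimal128Type struct {
	Precision int32
	Scale     int32
}

func (*Decimal128Type) ID() Type     { return DECIMAL128 }
func (*Decimal128Type) Name() string { return "decimal" }
func (t *Decimal128Type) Fingerprint() string { // placeholder format
	return fmt.Sprintf("decimal(%d,%d)", t.Precision, t.Scale)
}

// Compile-time assertions: the build fails if a type stops satisfying DataType.
var (
	_ DataType = (*BooleanType)(nil)
	_ DataType = (*Decimal128Type)(nil)
)

// Field is a cut-down stand-in for arrow.Field.
type Field struct {
	Name string
	Type DataType
}

// describe is a hypothetical helper that reads a field through the interface only.
func describe(f Field) string {
	return fmt.Sprintf("%s: id=%d name=%s", f.Name, f.Type.ID(), f.Type.Name())
}

func main() {
	fields := []Field{
		{Name: "f1-bool", Type: &BooleanType{}},
		{Name: "f2-decimal128", Type: &Decimal128Type{Precision: 10, Scale: 2}},
	}
	for _, f := range fields {
		fmt.Println(describe(f))
	}
}
```

The methods have pointer receivers, which is why the fields hold &BooleanType{} and &Decimal128Type{...} rather than plain struct values. The same assertion idiom should work against the real package, for example var _ arrow.DataType = (*arrow.Decimal128Type)(nil).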

Demonstrative example:

package main

import (
    "fmt"

    "github.com/apache/arrow/go/arrow"
)

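func main() {
    // BooleanType has no parameters; Decimal128Type takes a precision and a scale.
    boolType := &arrow.BooleanType{}
    decimal128Type := &arrow.Decimal128Type{Precision: 1, Scale: 1}

    // Both pointers satisfy arrow.DataType, so they can be used as Field.Type.
    schema := arrow.NewSchema(
        []arrow.Field{
            {Name: "f1-bool", Type: boolType},
            {Name: "f2-decimal128", Type: decimal128Type},
        },
        nil, // no schema-level metadata
    )

    fmt.Println(schema)
}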

Output:

schema:
  fields: 2
    - f1-bool: type=bool
    - f2-decimal128: type=decimal(1, 1)
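
On the second question: the precision and scale are set through the Precision and Scale fields of Decimal128Type, as in the example above. Precision is the total number of significant digits and scale is the number of digits after the decimal point; the column itself stores an unscaled integer. The sketch below uses only the standard library to show that mapping. The unscaled helper is hypothetical and not part of the arrow package.

```go
package main

import (
	"fmt"
	"math/big"
	"strings"
)

// unscaled converts a decimal string such as "123.45" into the integer that a
// decimal column with the given scale would hold (12345 for scale 2).
// It returns an error if the value needs more fractional digits than scale
// or more significant digits than precision.
// Hypothetical helper for illustration; it is not an arrow API.
func unscaled(s string, precision, scale int) (*big.Int, error) {
	neg := strings.HasPrefix(s, "-")
	s = strings.TrimPrefix(s, "-")
	intPart, fracPart, _ := strings.Cut(s, ".")
	if len(fracPart) > scale {
		return nil, fmt.Errorf("%q has more than %d fractional digits", s, scale)
	}
	// Pad the fraction with zeros so it has exactly `scale` digits.
	digits := intPart + fracPart + strings.Repeat("0", scale-len(fracPart))
	if digits == "" {
		return nil, fmt.Errorf("%q is not a decimal number", s)
	}
	for _, r := range digits {
		if r < '0' || r > '9' {
			return nil, fmt.Errorf("%q is not a decimal number", s)
		}
	}
	n, ok := new(big.Int).SetString(digits, 10)
	if !ok {
		return nil, fmt.Errorf("%q is not a decimal number", s)
	}
	// big.Int drops leading zeros, so the string length is the digit count.
	if len(n.String()) > precision {
		return nil, fmt.Errorf("%q needs more than %d digits", s, precision)
	}
	if neg {
		n.Neg(n)
	}
	return n, nil
}

func main() {
	// Try a few values against decimal(5, 2).
	for _, s := range []string{"123.45", "1.5", "1234.5"} {
		n, err := unscaled(s, 5, 2)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Printf("%s is stored as %s\n", s, n)
	}
}
```

Under this reading, the decimal(1, 1) column in the example above has room for a single digit after the decimal point and none before it, so a larger precision is usually what you want in practice.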

You can find these types in the documentation, which also has material on extension types.
I'm not familiar with extension types, so I can't show an example of them. If you are familiar with them, you should be able to work it out easily.

Statement:
This article is reproduced from stackoverflow.com.