Essential skills for Golang developers: Easily connect to Baidu AI interface to achieve speech recognition

WBOY
2023-08-12 15:21:03

Introduction: With the rapid development of artificial intelligence, speech recognition is gradually entering our daily lives and becoming an important means of communication and interaction. For a Golang developer, knowing how to connect to the Baidu AI interface for speech recognition makes application development considerably more convenient. This article shows how to use Golang to connect to the Baidu AI interface for speech recognition, with code examples.

  1. Register Baidu AI developer account
    Before we start, we need to register a Baidu AI developer account. On the Baidu AI open platform (https://ai.baidu.com/), click the "Register Now" button, fill in the relevant information and successfully register an account. After logging in, create an application in the "Console" and obtain the API Key and Secret Key.
  2. Install Golang development environment
    Make sure that the Golang development environment has been installed and configured correctly. You can download the installation package suitable for your operating system from the official website (https://golang.org/dl/), and then install and configure it according to the official documentation.
  3. Install necessary dependency packages
    Before writing code, install the two dependency packages used for HTTP requests and JSON parsing. With Go modules enabled, go get needs a go.mod file, so create one first if the project directory does not have it yet (the module name below is only an example). Open a terminal or command line tool and run:
go mod init speech_recognition
go get -u github.com/go-resty/resty/v2
go get -u github.com/json-iterator/go
  4. Write code to implement speech recognition function
    First, create a Go file, such as speech_recognition.go, and write the following code in it:
package main

import (
    "fmt"
    "io/ioutil"
    "os"
    "strings"

    "github.com/go-resty/resty/v2"
    jsoniter "github.com/json-iterator/go"
)

const (
    TokenURL  = "https://aip.baidubce.com/oauth/2.0/token"
    APIURL    = "http://vop.baidu.com/server_api"
    APIKey    = "your_api_key"    // replace with your API Key
    SecretKey = "your_secret_key" // replace with your Secret Key
    AudioFile = "audio.wav"       // replace with the path to your audio file
    DevUserID = "user01"          // replace with your user identifier
)

type TokenResponse struct {
    AccessToken string `json:"access_token"`
    ExpiresIn   int    `json:"expires_in"`
}

type RecognitionResult struct {
    ErrNo  int      `json:"err_no"`
    ErrMsg string   `json:"err_msg"`
    Result []string `json:"result"`
}

func main() {
    accessToken := getAccessToken()

    audioData, err := ioutil.ReadFile(AudioFile)
    if err != nil {
        fmt.Printf("failed to read audio file: %s\n", err.Error())
        os.Exit(1)
    }

    boundary := "12345678901234567890"
    // field renders one simple form part: boundary line, header, blank line, value.
    // multipart/form-data uses CRLF line endings.
    field := func(name, value string) string {
        return fmt.Sprintf("--%s\r\nContent-Disposition: form-data; name=\"%s\"\r\n\r\n%s\r\n", boundary, name, value)
    }
    body := field("dev_pid", "1537") +
        field("format", "wav") +
        field("channel", "1") +
        field("token", accessToken) +
        field("cuid", DevUserID) +
        field("len", fmt.Sprint(len(audioData))) +
        // The audio itself goes in a file part, followed by the closing boundary.
        fmt.Sprintf("--%s\r\nContent-Disposition: form-data; name=\"speech\"; filename=\"%s\"\r\nContent-Type: application/octet-stream\r\n\r\n%s\r\n--%s--",
            boundary, AudioFile, audioData, boundary)
    resp, err := resty.New().R().
        SetHeader("Content-Type", "multipart/form-data; boundary="+boundary).
        SetBody(body).
        Post(APIURL)
    if err != nil {
        fmt.Printf("request to the Baidu AI interface failed: %s\n", err.Error())
        os.Exit(1)
    }

    result := RecognitionResult{}
    if err := jsoniter.Unmarshal(resp.Body(), &result); err != nil {
        fmt.Printf("failed to parse the response: %s\n", err.Error())
        os.Exit(1)
    }

    // err_no is 0 on success; otherwise err_msg describes the failure.
    if result.ErrNo != 0 {
        fmt.Printf("recognition failed: %s\n", result.ErrMsg)
    } else {
        text := strings.Join(result.Result, "")
        fmt.Printf("recognition result: %s\n", text)
    }
}

func getAccessToken() string {
    resp, err := resty.New().R().
        SetQueryParams(map[string]string{
            "grant_type":    "client_credentials",
            "client_id":     APIKey,
            "client_secret": SecretKey,
        }).
        Get(TokenURL)
    if err != nil {
        fmt.Printf("failed to get a Baidu AI access token: %s\n", err.Error())
        os.Exit(1)
    }

    token := TokenResponse{}
    if err := jsoniter.Unmarshal(resp.Body(), &token); err != nil {
        fmt.Printf("failed to parse the token response: %s\n", err.Error())
        os.Exit(1)
    }

    return token.AccessToken
}
  5. Replace configuration parameters
    In the code, replace the placeholder API Key, Secret Key, audio file path, and user ID with your own values. The API Key and Secret Key can be found in the application created in the Baidu AI console. The audio file path points to the audio file to be recognized. The user ID is a custom string used to distinguish different users.
  6. Compile and run the code
    After writing the code, use the following commands to compile and run it:
go build speech_recognition.go
./speech_recognition
  7. Result verification
    After running the program, if everything works normally, the recognition result is printed to the console. If recognition fails, check whether the configuration parameters are correct and whether the audio file exists.

Summary: This article introduces how to use Golang to easily connect to Baidu AI interface to achieve speech recognition, and provides corresponding code examples. By mastering this skill, Golang developers can use Baidu AI interface to develop speech recognition applications more flexibly and conveniently. I hope this article can provide some help and inspiration to Golang developers in implementing speech recognition functions.
