How to convert text to speech in golang
With the rapid development of artificial intelligence, voice technology has become part of everyday life. Many scenarios call for converting text to speech quickly: reading aloud in education, automated answers in intelligent customer service, voice prompts in car navigation, and so on. Go (golang) offers an efficient and concise way to build such a feature. This article shows how to implement text-to-speech in golang.
In golang, we can implement the text-to-speech function through third-party libraries. There are currently two popular libraries on the market, namely go-tts and go-astits. Here, we choose the more mature and stable go-astits.
Run the following command in your project directory to fetch the package. Note that the import path used in this article is github.com/mkb218/gosynth/v2:
go get github.com/mkb218/gosynth/v2
This approach depends on a speech engine installed on the system. Two common choices are espeak and festival. Here, we choose to use espeak.
Install espeak (on Debian or Ubuntu):
sudo apt-get install espeak
After the installation is complete, copy the following code into your editor and save it as a .go file:
package main

import (
	"fmt"
	"os/exec"
	"time"

	"github.com/mkb218/gosynth/v2/synth"
)

func main() {
	// Call the espeak command to convert the text into an audio file.
	err := exec.Command("espeak", "-w", "test.wav", "Hello, World!").Run()
	if err != nil {
		fmt.Println("Failed to convert text to wave file:", err)
		return
	}

	// Play the audio file and wait until playback has finished.
	player := synth.NewWAVFilePlayer("test.wav")
	player.Play()
	time.Sleep(player.Duration())
}
Before running the above code, make sure the directory that test.wav will be written to exists and is writable. In this code, we call the espeak command through the Command function in the exec package; the -w option tells espeak to write the synthesized speech to a WAV file instead of playing it. We then use the NewWAVFilePlayer function from the synth package to play the test.wav audio file, sleeping for the duration of the clip so the program does not exit before playback ends.
Instead of installing a speech engine locally, we can also implement text-to-speech by calling a third-party speech API. Commonly used voice APIs include those provided by cloud service providers such as Alibaba Cloud and Tencent Cloud.
Here, we choose the Baidu speech synthesis API. To use it, go to https://ai.baidu.com/tech/speech/tts, register, create an application, and obtain the App ID, API Key and Secret Key that grant access to the API.
Install the HTTP client library used in the example:
go get github.com/go-resty/resty/v2
Write code to interact with the Baidu speech synthesis API:
package main

import (
	"encoding/json"
	"fmt"

	"github.com/go-resty/resty/v2"
)

// tokenResponse holds the only field of the OAuth reply that is needed here.
type tokenResponse struct {
	AccessToken string `json:"access_token"`
}

func main() {
	client := resty.New()

	// Obtain an access token with the API Key and Secret Key.
	resp, err := client.R().
		SetFormData(map[string]string{
			"grant_type":    "client_credentials",
			"client_id":     "YOUR_API_KEY",
			"client_secret": "YOUR_SECRET_KEY",
		}).
		Post("https://aip.baidubce.com/oauth/2.0/token")
	if err != nil {
		fmt.Println("Failed to get token:", err)
		return
	}
	var token tokenResponse
	if err := json.Unmarshal(resp.Body(), &token); err != nil {
		fmt.Println("Failed to unmarshal token response:", err)
		return
	}

	// Call the speech synthesis API to convert the text to speech.
	// The parameters are sent as form fields; all values are strings.
	resp, err = client.R().
		SetFormData(map[string]string{
			"tex":  "你好,欢迎使用百度语音合成API", // text to synthesize
			"tok":  token.AccessToken,   // access token from the step above
			"cuid": "YOUR_CUID",         // unique ID for the calling device or user
			"ctp":  "1",                 // client type
			"lan":  "zh",                // language
			"spd":  "5",                 // speaking speed
			"vol":  "15",                // volume
			"per":  "4",                 // voice selection
			"aue":  "3",                 // audio encoding (3 = mp3)
		}).
		Post("https://tsn.baidu.com/text2audio")
	if err != nil {
		fmt.Println("Failed to request API:", err)
		return
	}
	fmt.Println(resp.StatusCode())
}
Note that in the above code, you need to replace the API Key and Secret Key placeholders with the credentials you obtained from Baidu Cloud. By calling the Baidu speech synthesis API, we can implement speech synthesis without installing a local speech engine.
With either a local speech engine or a third-party API, we can quickly implement text-to-speech in golang. This article has outlined the basic steps of both approaches: driving the local espeak engine from Go, and calling the Baidu speech synthesis API. Developers interested in speech synthesis can choose whichever fits their environment: the local engine runs without a network connection, while the cloud API requires no local engine installation.