
golang encoding conversion

WBOY
2023-05-21 19:48:37

Go is a modern programming language known for its efficiency, concurrency support and portability. Real-world applications often need to convert text between different character encodings. This article introduces approaches to encoding conversion in golang.

  1. Basic knowledge of encoding

In computers, characters are stored as numeric codes defined by a character encoding, such as ASCII, GB2312 or UTF-8. Each encoding maps characters to bytes differently, so each has its own strengths and weaknesses.

ASCII is a widely used encoding, but it can represent only 128 characters (uppercase and lowercase letters, digits, and some special and control characters), which limits its use in internationalized software. GB2312 is a Chinese character encoding that covers approximately 7,000 Chinese characters, but it is used mainly within China. UTF-8 is also widely used and can represent characters from writing systems worldwide; its drawback is that East Asian text takes more bytes than in GB2312 or GB18030 (a common Chinese character takes three bytes in UTF-8 but two in GB2312).

Data therefore often has to be converted between encodings to suit the scenario in which it is used.
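The size difference is easy to observe from Go itself, because Go string literals are stored as UTF-8. The following is a minimal sketch; the helper name utf8Len is hypothetical and only wraps the built-in len.

```go
package main

import "fmt"

// utf8Len reports how many bytes s occupies in UTF-8,
// the encoding Go uses for string literals.
func utf8Len(s string) int {
	return len(s)
}

func main() {
	fmt.Println(utf8Len("A"))  // 1: ASCII characters take one byte
	fmt.Println(utf8Len("你"))  // 3: this character takes two bytes in GB2312
	fmt.Println(utf8Len("你好")) // 6
}
```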

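  2. Encoding conversion in golang

Go's standard library provides the encoding and unicode package families. The encoding packages deal with data formats (for example encoding/json and encoding/binary), while unicode, unicode/utf8 and unicode/utf16 deal with Unicode code points and their encodings. Conversion to and from legacy character sets such as GB2312 is handled outside the standard library, by golang.org/x/text/encoding or third-party packages.

In golang, a character (Unicode code point) is represented by the rune type, an alias for int32. A string is a read-only sequence of bytes that conventionally holds UTF-8 text, so one rune may occupy several bytes. The following introduces commonly used functions and examples related to encoding conversion in golang.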
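The difference between bytes and runes can be sketched as follows. The helper describe is a hypothetical name introduced for illustration; len, range and utf8.RuneCountInString are standard.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// describe returns the byte length and the rune count of s.
func describe(s string) (byteLen, runeCount int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	s := "Go语言"
	b, r := describe(s)
	fmt.Println(b, r) // 8 4: two 1-byte runes plus two 3-byte runes

	// Ranging over a string decodes UTF-8 and yields one rune per
	// iteration together with the byte offset where that rune starts.
	for i, c := range s {
		fmt.Printf("offset %d: %c\n", i, c)
	}
}
```

Converting with []rune(s) gives a slice with one element per code point, which is convenient when characters have to be indexed by position.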

2.1 Encoding conversion function in golang

2.1.1 bytes package

The bytes package provides functions for manipulating byte slices. Most of them mirror the functions of the strings package, so encoded text can be processed without first converting it to a string.

Function name  Description
func ToUpperSpecial  Converts the byte slice to uppercase using a custom unicode.SpecialCase mapping
func ToLowerSpecial  Converts the byte slice to lowercase using a custom unicode.SpecialCase mapping
func ToTitleSpecial  Converts the byte slice to title case using a custom unicode.SpecialCase mapping
func ToUpper  Converts the byte slice to uppercase
func ToLower  Converts the byte slice to lowercase
func ToTitle  Converts every letter in the byte slice to title case
func Title  Converts the first letter of each word to title case (deprecated)
func TrimSpace  Removes whitespace at the beginning and end of the byte slice
func Trim  Removes the specified characters at the beginning and end of the byte slice
func TrimFunc  Removes characters at the beginning and end that satisfy the specified function
func TrimLeftFunc  Removes characters on the left side that satisfy the specified function
func TrimRightFunc  Removes characters on the right side that satisfy the specified function
func HasPrefix  Reports whether the byte slice begins with the specified prefix
func HasSuffix  Reports whether the byte slice ends with the specified suffix
func Index  Returns the position of the first occurrence of the specified subslice
func LastIndex  Returns the position of the last occurrence of the specified subslice
func IndexFunc  Returns the position of the first character that satisfies the specified function
func LastIndexFunc  Returns the position of the last character that satisfies the specified function
func IndexByte  Returns the position of the first occurrence of the specified byte
func LastIndexByte  Returns the position of the last occurrence of the specified byte
func Count  Returns the number of non-overlapping occurrences of the specified subslice
func Replace  Replaces the first n occurrences of the specified subslice with another (all of them if n is negative)
func ReplaceAll  Replaces all occurrences of the specified subslice with another
func Split  Splits the byte slice into subslices at the specified separator
func SplitN  Splits the byte slice at the specified separator into at most N subslices
func SplitAfter  Splits the byte slice after each occurrence of the separator; the separator is included in each subslice
func SplitAfterN  Splits the byte slice after each occurrence of the separator into at most N subslices; the separator is included in each subslice
func Join  Joins a slice of byte slices into one byte slice using the specified separator
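A few of these functions combined, as a minimal sketch. The helper normalizeFields and its input are hypothetical; the bytes calls are the standard ones.

```go
package main

import (
	"bytes"
	"fmt"
)

// normalizeFields trims the line, splits it on commas, trims and
// upper-cases each field, and joins the fields with "|".
func normalizeFields(line []byte) []byte {
	fields := bytes.Split(bytes.TrimSpace(line), []byte(","))
	for i, f := range fields {
		fields[i] = bytes.ToUpper(bytes.TrimSpace(f))
	}
	return bytes.Join(fields, []byte("|"))
}

func main() {
	out := normalizeFields([]byte("  go, utf-8 ,gb2312  "))
	fmt.Printf("%s\n", out)                         // GO|UTF-8|GB2312
	fmt.Println(bytes.HasPrefix(out, []byte("GO"))) // true
	fmt.Println(bytes.Index(out, []byte("UTF")))    // 3
}
```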

2.1.2 encoding package

The standard encoding package itself only defines interfaces shared by its subpackages and does not convert between character sets. The rune-level encoding and decoding functions listed below live in unicode/utf8, unicode/utf16 and bytes; codecs for legacy encodings such as GB2312 are provided by golang.org/x/text/encoding.

Function name  Description
func utf16.Decode  Decodes a UTF-16 sequence ([]uint16) into a rune slice
func utf8.DecodeRune  Decodes the first UTF-8 encoded rune in a byte slice and returns it with its width in bytes
func utf8.DecodeLastRune  Decodes the last UTF-8 encoded rune in a byte slice and returns it with its width in bytes
func utf16.Encode  Converts a rune slice into a UTF-16 sequence ([]uint16)
func utf8.RuneCount  Counts the number of runes in a UTF-8 byte slice
func bytes.Runes  Decodes a UTF-8 byte slice into a rune slice
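In the standard library, rune-level decoding functions such as DecodeRune are found in unicode/utf8. The following sketch shows how they behave on UTF-8 input; firstAndLast is a hypothetical helper name.

```go
package main

import (
	"bytes"
	"fmt"
	"unicode/utf8"
)

// firstAndLast decodes the first and last runes of the UTF-8 data in b.
func firstAndLast(b []byte) (first, last rune) {
	first, _ = utf8.DecodeRune(b)
	last, _ = utf8.DecodeLastRune(b)
	return first, last
}

func main() {
	b := []byte("你好世界")
	first, last := firstAndLast(b)
	fmt.Printf("%c %c\n", first, last) // 你 界
	fmt.Println(utf8.RuneCount(b))     // 4
	fmt.Println(len(bytes.Runes(b)))   // 4

	// EncodeRune writes the UTF-8 form of a rune and returns its width.
	buf := make([]byte, utf8.UTFMax)
	fmt.Println(utf8.EncodeRune(buf, '界')) // 3
}
```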

2.1.3 unicode package

The unicode package provides functions for determining whether a character is a digit, a letter, and so on.

Function name  Description
func IsDigit  Reports whether the character is a decimal digit
func IsLetter  Reports whether the character is a letter
func IsLower  Reports whether the character is a lowercase letter
func IsUpper  Reports whether the character is an uppercase letter
func IsPunct  Reports whether the character is a punctuation mark
func IsGraphic  Reports whether the character is a graphic (visible) character, such as a letter, mark, number, punctuation mark, symbol or space
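These predicates take a rune, so they work for non-ASCII text as well. A minimal sketch, with countClasses as a hypothetical helper:

```go
package main

import (
	"fmt"
	"unicode"
)

// countClasses tallies the letters, digits and punctuation marks in s.
func countClasses(s string) (letters, digits, puncts int) {
	for _, r := range s {
		switch {
		case unicode.IsLetter(r):
			letters++
		case unicode.IsDigit(r):
			digits++
		case unicode.IsPunct(r):
			puncts++
		}
	}
	return letters, digits, puncts
}

func main() {
	l, d, p := countClasses("Go 1.21, 你好!")
	fmt.Println(l, d, p) // 4 3 3
}
```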

2.2 Encoding conversion examples in golang

The following are some examples of encoding conversion in golang:

2.2.1 Converting UTF-8 to GB2312

Example 1. Use the third-party mahonia package to convert a UTF-8 string to GB2312.

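package main

import (
    "fmt"
    "github.com/axgle/mahonia"
)

func main() {
    str := "你好,世界!" // Go string literals are UTF-8
    enc := mahonia.NewEncoder("GB2312")
    if enc == nil { // NewEncoder returns nil for an unknown charset name
        fmt.Println("unsupported charset")
        return
    }
    newStr := enc.ConvertString(str)
    // Print the GB2312 bytes in hex; a UTF-8 terminal would show them as garbage.
    fmt.Printf("% x\n", newStr)
}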

2.2.2 Converting GB2312 to UTF-8

Example 2. Use the third-party mahonia package to convert a GB2312 string to UTF-8.

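package main

import (
    "fmt"
    "github.com/axgle/mahonia"
)

func main() {
    enc := mahonia.NewEncoder("GB2312")
    dec := mahonia.NewDecoder("GB2312")
    if enc == nil || dec == nil { // nil means the charset name is unknown
        fmt.Println("unsupported charset")
        return
    }
    // A Go string literal is UTF-8, so build genuine GB2312 input first.
    gbStr := enc.ConvertString("你好,世界!")
    newStr := dec.ConvertString(gbStr) // GB2312 bytes back to UTF-8
    fmt.Println(newStr)
}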

  3. Summary

In practical applications, encoding conversion is a common problem. This article introduced encoding conversion in golang: the bytes, encoding and unicode packages for working with bytes and runes, and the mahonia package for converting between UTF-8 and GB2312. After studying this material, you should have a deeper understanding of encoding conversion in golang and be better prepared to apply it in practice.
