How to get the character length of rune in go language

藏色散人
2021-06-09 15:05:13

This tutorial from the golang column shows how to get the length of a string in characters, rather than bytes, by using rune in Go. I hope it helps anyone who needs it.

rune is a built-in type in Go. It is an alias for int32 and is equivalent to int32 in all ways. By convention, it is used to distinguish character values from integer values. The official definition is as follows:

// rune is an alias for int32 and is equivalent to int32 in all ways. It is
// used, by convention, to distinguish character values from integer values.
type rune = int32

Let's look at an example:

package main

import "fmt"

func main() {
    var str = "hello 你好啊"
    fmt.Println("len(str):", len(str))
}

Let's guess the result. "hello" has 5 characters, then there is 1 space and 3 Chinese characters, so we might expect a length of 9. Let's run it and see.


The result printed is 15. Why is this?

A string in Go is backed by a byte array, and len returns the number of bytes, not the number of characters. These Chinese characters occupy 2 bytes each in UTF-16 but 3 bytes each in UTF-8, and Go source code, and therefore this string literal, is encoded in UTF-8.

So the calculated length is 5 + 1 + 3*3 = 15.
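
The arithmetic above can be checked per character. The following is a minimal sketch, not part of the original article: byteWidths is a hypothetical helper name, and it relies on utf8.RuneLen from the standard unicode/utf8 package to report how many bytes each character needs in UTF-8.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteWidths is a hypothetical helper: it returns the number of bytes
// that each character (rune) of s occupies when encoded as UTF-8.
func byteWidths(s string) []int {
	widths := make([]int, 0, len(s))
	for _, r := range s { // range decodes one rune at a time
		widths = append(widths, utf8.RuneLen(r))
	}
	return widths
}

func main() {
	widths := byteWidths("hello 你好啊")

	total := 0
	for _, w := range widths {
		total += w
	}

	fmt.Println(widths) // [1 1 1 1 1 1 3 3 3]
	fmt.Println(total)  // 15
}
```

Six 1-byte ASCII characters plus three 3-byte Chinese characters add up to the 15 that len reported.
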
If we need the length of the string in characters instead of the number of underlying bytes, we can use either of the following methods:

package main

import (
    "fmt"
    "unicode/utf8"
)

func main() {
    var str = "hello 你好啊"
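    // A Go string is backed by a byte array, so len counts bytes.
    // Each Chinese character takes 3 bytes and therefore adds 3 to the length.
    fmt.Println("len(str):", len(str)) // 15

    // Either of the following gives the length of str in characters.

    // 1. The unicode/utf8 package provides a function that counts UTF-8 encoded runes.
    fmt.Println("RuneCountInString:", utf8.RuneCountInString(str)) // 9

    // 2. Convert the string to []rune, which holds one Unicode code point per element.
    fmt.Println("rune:", len([]rune(str))) // 9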
}

The running result is as follows:

len(str): 15
RuneCountInString: 9
rune: 9
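
The two methods differ in what they give you beyond the count: utf8.RuneCountInString only counts, while []rune(str) produces a slice that can be indexed by character. The sketch below is an illustration under that assumption and is not from the original article; firstN is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// firstN is a hypothetical helper: it returns the first n characters
// (runes) of s, or all of s if it has fewer than n characters.
func firstN(s string, n int) string {
	r := []rune(s)
	if n < 0 {
		n = 0
	}
	if n > len(r) {
		n = len(r)
	}
	return string(r[:n])
}

func main() {
	str := "hello 你好啊"

	// Counting only: no need to build a []rune.
	fmt.Println(utf8.RuneCountInString(str)) // 9

	// Slicing by character goes through []rune.
	fmt.Println(firstN(str, 7)) // hello 你

	// Slicing the string directly works on bytes, so str[:7] cuts
	// the 3-byte character 你 after its first byte.
	fmt.Println(utf8.ValidString(str[:7])) // false
}
```
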

Just above the definition of rune in the Go source there is a similar alias, byte = uint8:

// byte is an alias for uint8 and is equivalent to uint8 in all ways. It is
// used, by convention, to distinguish byte values from 8-bit unsigned
// integer values.
type byte = uint8
  • byte is equivalent to uint8 and is commonly used to process ASCII characters and raw bytes
  • rune is equivalent to int32 and is commonly used to process Unicode characters (code points), such as those decoded from UTF-8 text
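
The difference between the two types shows up in how a string is read. The following sketch is an added illustration, not from the original article: indexing a string yields a byte, while ranging over it yields runes together with their byte offsets.

```go
package main

import "fmt"

func main() {
	s := "a你"

	// Indexing a string yields a single byte (uint8).
	var b byte = s[1]
	fmt.Println(b) // 228, the first of the three UTF-8 bytes of 你

	// Ranging over a string decodes UTF-8 and yields runes (int32)
	// along with the byte offset where each rune starts.
	for i, r := range s {
		fmt.Printf("%d %c %U\n", i, r, r)
	}
	// 0 a U+0061
	// 1 你 U+4F60
}
```
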

Statement:
This article is reproduced from learnku.com. If there is any infringement, please contact admin@php.cn to have it removed.