
Is there a character type in go language?

青灯夜游 (original)
2021-06-04 17:08:24

Go has two character types: 1. byte, an alias for uint8, which holds a single byte and is enough for one ASCII character; 2. rune, an alias for int32, which holds one Unicode code point. To process Chinese, Japanese, or other multi-byte characters, use the rune type.


Environment used for this tutorial: Windows 10, Go 1.11.2, Dell G3 computer.

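The elements of a string are commonly called "characters". A character is what you get when you iterate over a string or read a single element from it: indexing a string yields a byte, while a for-range loop decodes the string's UTF-8 and yields runes.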

Go has two character types:

  • One is the byte type, an alias for uint8, which holds a single byte and can represent one ASCII character.

  • The other is the rune type, an alias for int32, which holds one Unicode code point. To process Chinese, Japanese, or other multi-byte characters, use the rune type.
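The difference between the two types shows up as soon as a string contains non-ASCII text. The following is a minimal sketch, assuming a sample string chosen only for illustration; byteAndRuneCounts is a hypothetical helper, not part of the original article.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteAndRuneCounts (hypothetical helper) returns the number of bytes
// and the number of Unicode code points (runes) in s.
func byteAndRuneCounts(s string) (int, int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	s := "Go语言" // sample string, chosen for illustration

	nBytes, nRunes := byteAndRuneCounts(s)
	fmt.Println(nBytes, nRunes) // 8 4

	// Indexing a string yields a byte (uint8), not a whole character.
	var b byte = s[0]
	fmt.Printf("%c %d\n", b, b) // G 71

	// Ranging over a string decodes UTF-8 and yields runes,
	// together with the byte offset where each rune starts.
	for i, r := range s {
		fmt.Printf("%d:%c ", i, r)
	}
	fmt.Println()

	// Converting to []rune gives one element per code point.
	rs := []rune(s)
	fmt.Println(len(rs), string(rs[2])) // 4 语
}
```

Each of the two Chinese characters takes 3 bytes in UTF-8, which is why the byte count and the rune count differ.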

byte is an alias for uint8. It works without any problem for traditional ASCII characters, which occupy only 1 byte. For example: var ch byte = 'A'. Character literals are enclosed in single quotes.

In the ASCII table, A has the value 65, which is 41 in hexadecimal, so the following declarations are equivalent:

var ch byte = 65 or var ch byte = '\x41'      // \x is always followed by exactly 2 hexadecimal digits

Another form is a backslash followed by exactly 3 octal digits, for example '\377' (decimal 255).
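The equivalent spellings can be checked directly. This is a small sketch; allEqual is a hypothetical helper introduced only for the comparison.

```go
package main

import "fmt"

// allEqual (hypothetical helper) reports whether every value in bs
// equals the first one.
func allEqual(bs ...byte) bool {
	for _, b := range bs {
		if b != bs[0] {
			return false
		}
	}
	return true
}

func main() {
	var a byte = 'A'    // character literal
	var b byte = 65     // decimal value
	var c byte = '\x41' // \x followed by exactly two hex digits
	var d byte = '\101' // backslash followed by exactly three octal digits
	fmt.Println(allEqual(a, b, c, d)) // true

	var top byte = '\377' // octal 377 = 255, the largest byte value
	fmt.Println(top)      // 255
}
```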

Go also supports Unicode, with source code and strings encoded as UTF-8. A character in this sense is called a Unicode code point, or rune, and is stored in memory as an integer (rune is an alias for int32). Documentation usually writes code points in the form U+hhhh, where each h is a hexadecimal digit.

To write a Unicode character by its code point, put the prefix \u or \U before the hexadecimal digits. \u is followed by exactly 4 hexadecimal digits (code points up to U+FFFF), and \U by exactly 8 hexadecimal digits (any code point up to U+10FFFF). Because many code points do not fit in a single byte, store them in a rune, or in another integer type wide enough to hold the value, such as int.

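package main

import "fmt"

func main() {
	var ch rune = '\u0041'      // \u takes exactly 4 hex digits
	var ch2 rune = '\u03B2'
	var ch3 rune = '\U00101234' // \U takes exactly 8 hex digits
	fmt.Printf("%d - %d - %d\n", ch, ch2, ch3) // decimal integer value
	fmt.Printf("%c - %c - %c\n", ch, ch2, ch3) // the character itself
	fmt.Printf("%X - %X - %X\n", ch, ch2, ch3) // code point value in hexadecimal
	fmt.Printf("%U - %U - %U\n", ch, ch2, ch3) // Unicode notation, U+hhhh
}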

Output:

65 - 946 - 1053236
A - β - □   (U+101234 is a private-use code point; the glyph shown depends on the font)
41 - 3B2 - 101234
U+0041 - U+03B2 - U+101234

The verb %c prints the character itself. Applied to a character, %v or %d prints the integer that represents it, and %U prints a string of the form U+hhhh.
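Note that %X prints the code point's numeric value in hexadecimal, which is not the same thing as the character's UTF-8 encoding. The sketch below prints both side by side; utf8Bytes is a hypothetical helper, and it relies on the standard string conversion to produce the encoding.

```go
package main

import "fmt"

// utf8Bytes (hypothetical helper) returns the UTF-8 encoding of r.
// Converting a rune to a string encodes it as UTF-8; an invalid
// rune is encoded as U+FFFD, the replacement character.
func utf8Bytes(r rune) []byte {
	return []byte(string(r))
}

func main() {
	for _, r := range []rune{'\u0041', '\u03B2', '\U00101234'} {
		// %U: Unicode notation, %X: value in hex, "% X": bytes separated by spaces
		fmt.Printf("%U  value %X  UTF-8 bytes % X\n", r, r, utf8Bytes(r))
	}
	// U+0041  value 41  UTF-8 bytes 41
	// U+03B2  value 3B2  UTF-8 bytes CE B2
	// U+101234  value 101234  UTF-8 bytes F4 81 88 B4
}
```

Only for ASCII characters do the code point value and the UTF-8 byte coincide.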

The unicode package provides functions for testing characters. Each returns a bool, as shown below (where ch is a rune):

  • Check whether it is a letter: unicode.IsLetter(ch)

  • Check whether it is a digit: unicode.IsDigit(ch)

  • Check whether it is a whitespace character: unicode.IsSpace(ch)
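These three functions can be combined to classify the characters of a string. The following is a minimal sketch; classify and its labels are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"unicode"
)

// classify (hypothetical helper) labels a rune using the
// unicode package's test functions.
func classify(ch rune) string {
	switch {
	case unicode.IsLetter(ch):
		return "letter"
	case unicode.IsDigit(ch):
		return "digit"
	case unicode.IsSpace(ch):
		return "space"
	default:
		return "other"
	}
}

func main() {
	// Ranging over the string yields runes, so non-ASCII
	// letters such as β are classified correctly.
	for _, ch := range "a9 β!" {
		fmt.Printf("%q: %s\n", ch, classify(ch))
	}
}
```

Because the functions take a rune rather than a byte, they work for letters outside ASCII as well.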
