Go Strings: Rune vs. Byte: What's the Difference When Ranging and Indexing?
Rune vs. Byte When Ranging over and Indexing Strings
In Go, ranging over a string and indexing into it behave differently. A for range loop over a string yields values of type rune, each representing a Unicode code point. Indexing with str[index] returns a value of type byte. The distinction matters whenever a string contains non-ASCII text: UTF-8 encodes such characters in more than one byte, so a single index can land in the middle of a character.
According to the Go specification, a string value is a (possibly empty) sequence of bytes. Strings are immutable: once created, their contents cannot be changed. The length len(s) is the number of bytes, not the number of characters. The bytes can be accessed with integer indices 0 through len(s)-1, which is why s[i] yields a single byte.
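The indexing behavior described above can be seen in a minimal sketch. The helper byteAt and the sample string "héllo" are illustrative choices, not part of the original discussion; the example assumes the source file is saved as UTF-8, as Go requires.

```go
package main

import "fmt"

// byteAt is a hypothetical helper that returns the byte stored at index i of s.
func byteAt(s string, i int) byte {
	return s[i]
}

func main() {
	s := "héllo" // "é" is encoded as two bytes in UTF-8: 0xC3 0xA9

	fmt.Println(len(s))      // 6: length in bytes, not characters
	fmt.Printf("%T\n", s[1]) // uint8 (byte is an alias for uint8)

	// Indices 1 and 2 each hold one half of the encoding of "é".
	fmt.Printf("%#x %#x\n", byteAt(s, 1), byteAt(s, 2)) // 0xc3 0xa9

	// s[1] = 'e' would not compile: string contents cannot be modified.
}
```

Note that s[1] on its own is not a meaningful character here; it is only the first byte of a two-byte sequence.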
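In contrast, the range clause of a for loop can iterate over several kinds of values, including strings. For a string, for range iterates over the Unicode code points, starting at byte index 0. Each iteration yields two values: the index of the first byte of the current UTF-8-encoded code point, as an int, and the code point itself, as a rune. Because a code point may span several bytes, the index can advance by more than one between iterations. If the loop encounters an invalid UTF-8 sequence, the rune is 0xFFFD (the Unicode replacement character) and the next iteration advances by a single byte. This behavior is defined in the Go programming language specification.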
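A short sketch of what for range produces for a string with a multi-byte character. The helper runeStarts and the sample strings are assumptions made for illustration only.

```go
package main

import "fmt"

// runeStarts is a hypothetical helper that returns the byte index at which
// each code point of s begins, in the order a for range loop visits them.
func runeStarts(s string) []int {
	var starts []int
	for i := range s {
		starts = append(starts, i)
	}
	return starts
}

func main() {
	s := "héllo"
	for i, r := range s {
		// i is a byte index; r is a rune (an alias for int32).
		fmt.Printf("%d %c %T\n", i, r, r)
	}
	// Prints:
	// 0 h int32
	// 1 é int32
	// 3 l int32
	// 4 l int32
	// 5 o int32

	fmt.Println(runeStarts(s)) // [0 1 3 4 5]

	// An invalid UTF-8 byte yields U+FFFD and the loop advances one byte.
	for i, r := range "a\xffb" {
		fmt.Printf("%d %U\n", i, r)
	}
	// Prints:
	// 0 U+0061
	// 1 U+FFFD
	// 2 U+0062
}
```

The index jumps from 1 to 3 because "é" occupies bytes 1 and 2.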
If you specifically want to iterate over the individual bytes of a string, use a regular for loop with an integer index (for i := 0; i < len(s); i++), or convert the string to a byte slice with []byte(s) and range over the result. Ranging over a []byte yields bytes, not runes.
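Both byte-wise approaches can be sketched as follows. The helper bytesByIndex is a hypothetical name introduced for this example.

```go
package main

import "fmt"

// bytesByIndex is a hypothetical helper that collects the bytes of s
// using a classic indexed loop.
func bytesByIndex(s string) []byte {
	out := make([]byte, 0, len(s))
	for i := 0; i < len(s); i++ {
		out = append(out, s[i]) // s[i] is a byte
	}
	return out
}

func main() {
	s := "é" // two bytes in UTF-8

	// Option 1: indexed loop over the string.
	fmt.Println(bytesByIndex(s)) // [195 169]

	// Option 2: convert to []byte and range over the slice.
	for i, b := range []byte(s) {
		fmt.Printf("%d %#x\n", i, b)
	}
	// Prints:
	// 0 0xc3
	// 1 0xa9
}
```

The indexed loop reads the string in place, whereas []byte(s) produces a separate, mutable byte slice, so the first form is usually the lighter choice when the bytes only need to be read.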
To summarize, ranging over a string in Go yields Unicode code points (type rune) together with the byte index where each one starts, while indexing a string yields individual bytes (type byte). This distinction follows from the definition of the string type as a sequence of bytes and from the range clause's behavior for strings. Keeping it in mind is essential for handling non-ASCII text correctly in Go.