
How to Effectively Handle Invalid UTF-8 Characters in Go Strings?

DDD
2024-12-07 06:44:14


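When marshalling a list of strings with json.Marshal, you may encounter the error "json: invalid UTF-8 in string". It means that at least one of the strings contains a byte sequence that is not valid UTF-8. Go releases before 1.2 return this error; since Go 1.2, json.Marshal instead replaces the invalid bytes with the Unicode replacement character (U+FFFD), so cleaning the strings yourself is still useful when you want control over the result.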
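Before fixing anything, it can help to find out which strings are affected. The following is a minimal sketch using utf8.ValidString from the standard library; the helper name invalidIndexes and the sample data are made up for illustration.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// invalidIndexes is a hypothetical helper: it returns the positions of the
// strings in list that are not valid UTF-8.
func invalidIndexes(list []string) []int {
	var bad []int
	for i, s := range list {
		if !utf8.ValidString(s) {
			bad = append(bad, i)
		}
	}
	return bad
}

func main() {
	// "\xc5" starts a two-byte sequence but is followed by 'z',
	// which is not a continuation byte, so the second string is invalid.
	list := []string{"hello", "a\xc5z", "posição"}
	fmt.Println(invalidIndexes(list)) // [1]
}
```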

Replacement or Removal of Invalid Characters

In Python, you can choose to ignore, replace, or raise an exception on invalid UTF-8 when decoding. Go's standard library offers comparable options:

Using strings.ToValidUTF8 (Go 1.13+)

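This function returns a copy of the string in which each run of invalid UTF-8 bytes is replaced by the replacement string passed as the second argument. Pass "" to remove the invalid bytes, or "\uFFFD" to substitute the Unicode replacement character (U+FFFD).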

fixedString := strings.ToValidUTF8("a\xc5z", "") // fixedString is "az"
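As a fuller sketch, the same call can be applied to every element of a list before marshalling it. The helper name sanitizeAll and the sample data are assumptions made for this example; strings.ToValidUTF8 and json.Marshal are the standard library functions discussed above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// sanitizeAll is a hypothetical helper: it returns a copy of list in which
// every run of invalid UTF-8 bytes is replaced by replacement.
func sanitizeAll(list []string, replacement string) []string {
	out := make([]string, len(list))
	for i, s := range list {
		out[i] = strings.ToValidUTF8(s, replacement)
	}
	return out
}

func main() {
	list := []string{"a\xc5z", "ok"}

	// An empty replacement removes the invalid bytes entirely.
	removed := sanitizeAll(list, "")

	data, err := json.Marshal(removed)
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(string(data)) // ["az","ok"]

	// Passing "\uFFFD" keeps a visible marker where the bytes were.
	replaced := sanitizeAll(list, "\uFFFD")
	fmt.Println(len(replaced[0])) // 5: 'a' + 3-byte U+FFFD + 'z'
}
```

Removing the bytes gives the cleanest output, while substituting U+FFFD preserves the information that something was lost; which is preferable depends on how the data is used downstream.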

Mapping and Replacing with utf8.RuneError (Go 1.11+)

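You can also filter a string with strings.Map and utf8.RuneError. strings.Map decodes the string rune by rune, and each invalid byte is decoded as utf8.RuneError (U+FFFD). When the mapping function returns a negative value such as -1, that character is dropped from the result. Note that a correctly encoded U+FFFD already present in the input is dropped as well, as the second input in the example shows.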

fixUtf := func(r rune) rune {
    if r == utf8.RuneError {
        return -1
    }
    return r
}

var input1 = "a\xc5z"
fmt.Println(strings.Map(fixUtf, input1)) // Output: az

var input2 = "posic�o"
fmt.Println(strings.Map(fixUtf, input2)) // Output: posico
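If a legitimately encoded U+FFFD should survive, the two cases can be told apart by the decoded width: utf8.DecodeRuneInString reports a width of 1 for an invalid byte and 3 for a real U+FFFD. The sketch below illustrates that idea; the function name dropInvalid is hypothetical, and the result should match strings.ToValidUTF8(s, "") on Go 1.13 and later.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// dropInvalid is a hypothetical helper: it removes only the bytes that fail
// to decode as UTF-8 and keeps a correctly encoded U+FFFD.
func dropInvalid(s string) string {
	if utf8.ValidString(s) {
		return s // nothing to fix, avoid copying
	}
	var b strings.Builder
	b.Grow(len(s))
	for i := 0; i < len(s); {
		r, size := utf8.DecodeRuneInString(s[i:])
		if r == utf8.RuneError && size == 1 {
			i++ // invalid byte: skip it
			continue
		}
		b.WriteString(s[i : i+size])
		i += size
	}
	return b.String()
}

func main() {
	fmt.Println(dropInvalid("a\xc5z")) // az

	// The real U+FFFD is kept; only the trailing stray byte is removed.
	out := dropInvalid("posic\uFFFDo\xc5")
	fmt.Println(out == "posic\uFFFDo") // true
}
```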
