How Can I Remove Invalid UTF-8 Characters in Go?

Patricia Arquette
2024-12-14 16:10:17

Removing Invalid UTF-8 Characters in Go

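When working with JSON data, you may encounter strings that contain bytes which do not form valid UTF-8 sequences. Before Go 1.2, json.Marshal returned an error for such strings; since Go 1.2 it silently replaces each invalid byte with the Unicode replacement character U+FFFD, which is often not the result you want either.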
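To check whether a given string is affected, a minimal sketch might test it with utf8.ValidString and print what json.Marshal returns for it, so you can see how your Go version treats the input. The helper name hasInvalidUTF8 is hypothetical and used here only for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"unicode/utf8"
)

// hasInvalidUTF8 is a hypothetical helper: it reports whether s contains
// any byte sequence that is not valid UTF-8.
func hasInvalidUTF8(s string) bool {
	return !utf8.ValidString(s)
}

func main() {
	for _, s := range []string{"abc", "a\xc5z"} {
		// Marshal each sample and show the result next to the validity check.
		out, err := json.Marshal(s)
		fmt.Printf("invalid=%v json=%s err=%v\n", hasInvalidUTF8(s), out, err)
	}
}
```

Running the check before encoding lets you decide explicitly whether to reject, strip, or replace the offending bytes.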

Handling Invalid UTF-8 Characters

In Go, you can remove or replace the invalid bytes. Which approach is available depends on your Go version:

Go 1.13 and later

fmt.Println(strings.ToValidUTF8("a\xc5z", "")) // prints "az"
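The second argument of strings.ToValidUTF8 is the replacement string, so the same call can either drop invalid bytes or mark where they were. The sketch below wraps both uses in two hypothetical helpers, stripInvalid and markInvalid. Note that each run of consecutive invalid bytes is replaced once, not once per byte.

```go
package main

import (
	"fmt"
	"strings"
)

// stripInvalid (hypothetical name) removes invalid UTF-8 bytes entirely.
func stripInvalid(s string) string {
	return strings.ToValidUTF8(s, "")
}

// markInvalid (hypothetical name) replaces each run of invalid bytes
// with a single U+FFFD replacement character.
func markInvalid(s string) string {
	return strings.ToValidUTF8(s, "\uFFFD")
}

func main() {
	fmt.Println(stripInvalid("a\xc5z")) // az
	// %+q prints the result as an ASCII-only quoted string.
	fmt.Printf("%+q\n", markInvalid("a\xc5\xffz"))
}
```

For []byte input, bytes.ToValidUTF8 offers the same behavior without a string conversion.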

Go 1.11 and later

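// fixUtf is a mapping function for strings.Map: it drops any rune
// that decodes as utf8.RuneError and keeps everything else.
fixUtf := func(r rune) rune {
    if r == utf8.RuneError {
        return -1 // a negative value tells strings.Map to drop the rune
    }
    return r
}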

fmt.Println(strings.Map(fixUtf, "a\xc5z"))
fmt.Println(strings.Map(fixUtf, "posic�o"))

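strings.Map passes each invalid byte to the mapping function as utf8.RuneError and drops any character for which the function returns a negative value, so the invalid bytes are removed. Because utf8.RuneError is the same code point as U+FFFD, a correctly encoded replacement character is removed as well. The output is: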

az
posico
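If legitimately encoded U+FFFD characters must be kept, one option is to decode the string manually and skip only the bytes that fail to decode. utf8.DecodeRuneInString reports a size of 1 together with utf8.RuneError for an invalid byte, while a real U+FFFD has a size of 3. The following is a minimal sketch under that assumption; the function name removeInvalidBytes is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// removeInvalidBytes (hypothetical name) drops bytes that are not part of a
// valid UTF-8 sequence, but keeps correctly encoded U+FFFD characters.
func removeInvalidBytes(s string) string {
	if utf8.ValidString(s) {
		return s // fast path: nothing to fix
	}
	var b strings.Builder
	b.Grow(len(s))
	for i := 0; i < len(s); {
		r, size := utf8.DecodeRuneInString(s[i:])
		if r == utf8.RuneError && size == 1 {
			i++ // invalid byte: skip it
			continue
		}
		b.WriteString(s[i : i+size]) // valid rune: copy its bytes unchanged
		i += size
	}
	return b.String()
}

func main() {
	fmt.Println(removeInvalidBytes("a\xc5z"))
	// %+q prints the result as an ASCII-only quoted string.
	fmt.Printf("%+q\n", removeInvalidBytes("ok \uFFFD\xc5"))
}
```

On Go 1.13 and later, strings.ToValidUTF8 with an empty replacement also leaves valid U+FFFD characters in place, so a hand-written loop like this is mainly useful on older toolchains.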
