How to Efficiently Convert HTML Escape Sequences in Go?

Susan Sarandon
2024-12-17 15:22:16

Converting Escape Characters in HTML Tags

In Go, converting text that contains escaped HTML characters back to its original form is not as straightforward as one might like. json.Marshal() readily escapes characters such as "<" and ">", turning "<html>" into "\u003chtml\u003e", but json.Unmarshal() is not a convenient tool for the reverse operation on a bare string, since it expects a complete JSON value and a destination variable.
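
To see where such escape sequences come from, here is a minimal sketch of the json.Marshal() behavior described above. The helper name escapeJSON is hypothetical and used only for illustration; it is not part of any library.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// escapeJSON is a hypothetical helper that returns the JSON encoding of s.
// By default, encoding/json escapes <, > and & as \u003c, \u003e and \u0026.
func escapeJSON(s string) (string, error) {
	b, err := json.Marshal(s)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	out, err := escapeJSON("<html>")
	if err != nil {
		panic(err)
	}
	// The result is a JSON string, so it includes the surrounding quotes.
	fmt.Println(out) // "\u003chtml\u003e"
}
```

Note that the encoded value already carries its own double quotes, which is relevant for the unquoting approach that follows.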

Using strconv.Unquote()

strconv.Unquote() can perform the conversion. It expects its argument to be a quoted Go string literal, so the enclosing quotation marks must be added manually.

package main

import (
    "fmt"
    "strconv"
)

func main() {
    // Important to use backtick ` (raw string literal)
    // else the compiler will unquote it (interpreted string literal)!

    s := `\u003chtml\u003e`
    fmt.Println(s)
    s2, err := strconv.Unquote(`"` + s + `"`)
    if err != nil {
        panic(err)
    }
    fmt.Println(s2)
}

Output:

\u003chtml\u003e
<html>

Note:

The html package is also available for escaping and unescaping HTML text. However, it does not decode Unicode escape sequences of the form \uxxxx, only HTML entities such as &#decimal; or &#xHH;.

package main

import (
    "fmt"
    "html"
)

func main() {
    fmt.Println(html.UnescapeString(`\u003chtml\u003e`)) // wrong: \u escapes are not decoded
    fmt.Println(html.UnescapeString(`&#60;html&#62;`))   // good: decimal entities
    fmt.Println(html.UnescapeString(`&#x3c;html&#x3e;`)) // good: hexadecimal entities
}

Output:

\u003chtml\u003e
<html>
<html>
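
When the input may mix both notations, the two decoders can be chained. The following is a minimal sketch, not a definitive implementation: decodeEscapes is a hypothetical helper name, and it assumes the input contains no unescaped double quotes or newlines, which strconv.Unquote() would reject.

```go
package main

import (
	"fmt"
	"html"
	"strconv"
)

// decodeEscapes is a hypothetical helper. It first resolves Go-style
// escapes such as \u003c with strconv.Unquote, then resolves HTML
// entities such as &#60; with html.UnescapeString.
// Assumption: s has no unescaped double quotes or raw newlines.
func decodeEscapes(s string) (string, error) {
	unquoted, err := strconv.Unquote(`"` + s + `"`)
	if err != nil {
		return "", err
	}
	return html.UnescapeString(unquoted), nil
}

func main() {
	out, err := decodeEscapes(`\u003chtml\u003e and &#60;body&#62;`)
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // <html> and <body>
}
```

The order matters little here, but returning the error from strconv.Unquote() lets the caller decide what to do with input that is not a valid string literal body.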

Note 2:

Remember that string literals in double quotes (") are interpreted string literals: the compiler processes their escape sequences, so \u003c becomes < at compile time. To keep the escape sequences intact in the string value, use backticks to create a raw string literal.

s := "\u003chtml\u003e" // Interpreted string literal (unquoted by the compiler!)
fmt.Println(s)

s2 := `\u003chtml\u003e` // Raw string literal (no unquoting will take place)
fmt.Println(s2)

s3 := "\\u003chtml\\u003e" // Interpreted string literal with escaped backslashes
                           // (the compiler turns each \\ into a single \)
fmt.Println(s3)

Output:

<html>
\u003chtml\u003e
\u003chtml\u003e
