Golang web application security: Should you check whether the input is valid UTF-8?

WBOY, 2024-02-10 08:10:08

This article covers one aspect of Golang web application security: checking whether input is valid UTF-8. Input validation matters in web development because users may submit malformed data or illegal characters, and handling the encoding of that input correctly is part of keeping an application secure. The question and answer below look at whether an explicit UTF-8 check is needed when a Gin application decodes JSON input.

Question content

According to several best-practice documents, it is a good idea to check that input data is valid UTF-8.

In my project, I use Gin with go-playground/validator for validation. It has an "ascii" validator but no "utf-8" validator.

I found https://pkg.go.dev/unicode/utf8#ValidString and I was wondering whether using it to check the input would be of any help, or whether valid UTF-8 is already a given since Go itself uses Unicode internally?

This is an example:

package main

import (
    "net/http"

    "github.com/gin-gonic/gin"
)

type User struct {
    Name string `json:"name" binding:"required,alphanum"`
}

func main() {
    r := gin.Default()
    r.POST("/user", createUserHandler)
    r.Run()
}

func createUserHandler(c *gin.Context) {
    var newUser User
    err := c.ShouldBindJSON(&newUser)

    if err != nil {
        c.AbortWithError(http.StatusBadRequest, err)
        return
    }

    c.Status(http.StatusCreated)
}

After calling c.ShouldBindJSON, is it guaranteed that Name in newUser is UTF-8 encoded? Is there any benefit to using utf8.ValidString to check Name?

Solution

Gin uses the standard encoding/json package to unmarshal JSON documents. The package documentation states:

When unmarshaling quoted strings, invalid UTF-8 or invalid UTF-16 surrogate pairs are not treated as errors. Instead, they are replaced by the Unicode replacement character U+FFFD.

The decoded string values are therefore guaranteed to be valid UTF-8. There is no benefit to using utf8.ValidString to check them.
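The replacement behaviour can be observed with a small standalone program. This is a minimal sketch that calls encoding/json directly rather than going through Gin; the User struct mirrors the one in the question without the binding tag, and decodeName is a helper name made up for this illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"unicode/utf8"
)

// User mirrors the struct from the question, minus the Gin binding tag.
type User struct {
	Name string `json:"name"`
}

// decodeName (hypothetical helper) unmarshals a JSON document and
// returns the decoded name field.
func decodeName(body []byte) (string, error) {
	var u User
	if err := json.Unmarshal(body, &u); err != nil {
		return "", err
	}
	return u.Name, nil
}

func main() {
	// The byte 0xff never appears in valid UTF-8.
	body := []byte("{\"name\":\"a\xffb\"}")
	fmt.Println(utf8.Valid(body)) // false: the raw body is not valid UTF-8

	name, err := decodeName(body)
	fmt.Println(err)                    // <nil>: the invalid byte is not an error
	fmt.Printf("%+q\n", name)           // "a\ufffdb": 0xff was replaced by U+FFFD
	fmt.Println(utf8.ValidString(name)) // true: the decoded value is valid UTF-8
}
```

Because the decoder has already substituted U+FFFD, calling utf8.ValidString on the decoded field cannot fail here.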

Depending on application requirements, you may want to check for and handle the Unicode replacement character "�". Aside: as the � in this answer shows, Stack Overflow treats the Unicode replacement character like any other character.
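If the application does need to reject such input, one possible check is to look for U+FFFD in the decoded value. The sketch below is an assumption about how that might be done, not part of Gin or the validator package; containsReplacementChar is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"strings"
)

// replacementChar is the UTF-8 encoded form of U+FFFD.
const replacementChar = "\uFFFD"

// containsReplacementChar (hypothetical helper) reports whether s
// contains a literal U+FFFD code point.
func containsReplacementChar(s string) bool {
	return strings.Contains(s, replacementChar)
}

func main() {
	fmt.Println(containsReplacementChar("alice"))     // false
	fmt.Println(containsReplacementChar("a\uFFFDb")) // true
}
```

Note that this check cannot tell whether the client actually sent U+FFFD or whether the decoder substituted it for invalid bytes; both cases look the same after unmarshalling.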

Does Go itself use Unicode internally?

Some language features assume UTF-8 encoding (ranging over a string, and conversions between []rune and string), but these features do not restrict which bytes can be stored in a string. A string can contain any byte sequence, including invalid UTF-8.
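A short sketch makes the distinction concrete: the string below holds a byte that is not valid UTF-8, and only the UTF-8-aware features interpret it as U+FFFD. The helper runesOf is a name made up for this example.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runesOf (hypothetical helper) converts s to a rune slice; each
// invalid byte becomes U+FFFD in the result.
func runesOf(s string) []rune {
	return []rune(s)
}

func main() {
	s := "a\xffb" // three bytes; 0xff is not valid UTF-8

	fmt.Println(len(s))              // 3: the string stores the raw bytes unchanged
	fmt.Println(utf8.ValidString(s)) // false

	// Ranging over a string decodes UTF-8 and yields U+FFFD for the bad byte.
	for i, r := range s {
		fmt.Printf("%d:%U ", i, r)
	}
	fmt.Println() // the loop printed: 0:U+0061 1:U+FFFD 2:U+0062

	// Converting to []rune and back does not round-trip: the single byte
	// 0xff becomes the three-byte encoding of U+FFFD.
	fmt.Println(string(runesOf(s)) == s) // false
}
```

This is why utf8.ValidString remains useful for strings that arrive by routes that do not decode or sanitize them, even though it adds nothing for values already decoded by encoding/json.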

Statement:
This article is reproduced from stackoverflow.com. If there is any infringement, please contact admin@php.cn to request deletion.