
[Jsoup Learning Etiquette] Sanitize untrusted HTML (to prevent XSS attacks)

WBOY
2016-06-24 11:48:27

Question

Websites often let users post comments. A malicious user can embed scripts in a comment, and those scripts can break the behavior of the whole page or, worse, steal confidential information. The submitted HTML therefore needs to be cleaned to prevent cross-site scripting (XSS) attacks.

Method

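Clean the input with jsoup's HTML cleaner, Jsoup.clean, and pass it a Whitelist that specifies which tags and attributes are allowed. (In jsoup 1.14.2 and later this class is named Safelist; Whitelist was deprecated and has since been removed.)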

String unsafe =
  "<p><a href='http://example.com/' onclick='stealCookies()'>Link</a></p>";
String safe = Jsoup.clean(unsafe, Whitelist.basic());
// now: <p><a href="http://example.com/" rel="nofollow">Link</a></p>

Description

XSS stands for cross-site scripting; the abbreviation CSS is avoided because it already means Cascading Style Sheets. In an XSS attack, the attacker inserts malicious HTML or script into a web page. When another user views the page, the injected code runs in that user's browser and carries out the attack against that user. Because the attack only takes effect when a victim loads the affected page, and is often regarded as hard to exploit, its danger is frequently underestimated. A common defence is to accept only plain text from users, but this results in a poor user experience.
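To make the plain-text approach concrete, here is a minimal sketch that escapes HTML metacharacters so that user input is displayed as text rather than interpreted as markup. It uses only the Java standard library; the names PlainTextEscaper and escapeHtml are hypothetical and come from neither jsoup nor the original article.

```java
class PlainTextEscaper {

    // Replaces the characters that have special meaning in HTML with
    // their entity equivalents, so the browser renders them literally.
    static String escapeHtml(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String comment = "<script>alert('x')</script>";
        // prints: &lt;script&gt;alert(&#39;x&#39;)&lt;/script&gt;
        System.out.println(escapeHtml(comment));
    }
}
```

Escaping everything is safe but removes all formatting, which is the user-experience cost mentioned above.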

A better solution is to use a WYSIWYG rich text editor such as CKEditor or TinyMCE. These editors let the user edit visually and output HTML. They can validate input on the client side, but that is not secure enough: an attacker can bypass client-side JavaScript validation and submit unsafe HTML directly to your website. The input must therefore be validated on the server side, and harmful HTML removed, so that the HTML that reaches your website is safe.
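As a rough sketch of the server-side check described above, Jsoup.isValid reports whether submitted HTML already conforms to a whitelist, and Jsoup.clean strips whatever does not. The class and method names (CommentValidator, isAcceptable, sanitize) are hypothetical. The example assumes jsoup is on the classpath in a version that still ships Whitelist; on jsoup 1.14.2 or later, substitute Safelist.

```java
import org.jsoup.Jsoup;
import org.jsoup.safety.Whitelist;

class CommentValidator {

    // True only if the fragment uses nothing outside the basic whitelist.
    static boolean isAcceptable(String bodyHtml) {
        return Jsoup.isValid(bodyHtml, Whitelist.basic());
    }

    // Removes every tag and attribute the basic whitelist does not permit.
    static String sanitize(String bodyHtml) {
        return Jsoup.clean(bodyHtml, Whitelist.basic());
    }

    public static void main(String[] args) {
        String submitted =
            "<p><a href='http://example.com/' onclick='stealCookies()'>Link</a></p>";

        if (isAcceptable(submitted)) {
            System.out.println("Accepted as submitted");
        } else {
            // Either reject the request or store the cleaned version.
            System.out.println("Cleaned: " + sanitize(submitted));
        }
    }
}
```

Whether to reject invalid input outright or to store the cleaned result is a design choice; rejecting is stricter, while cleaning is more forgiving of editors that emit extra markup.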

jsoup's whitelist cleaner filters user-supplied HTML on the server side and passes through only the tags and attributes that the whitelist permits.

jsoup provides a set of basic Whitelist configurations that meet most requirements. They can be modified if necessary, but do so carefully: every tag or attribute you add widens what users are able to submit.
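A minimal sketch of modifying a preset, assuming the site wants to allow images in comments in addition to the basic tags. The class name CustomWhitelistExample, its methods, and the choice of img attributes are illustrative assumptions; addTags, addAttributes and addProtocols are existing Whitelist methods (also available on Safelist in newer jsoup versions).

```java
import org.jsoup.Jsoup;
import org.jsoup.safety.Whitelist;

class CustomWhitelistExample {

    // Starts from the basic preset and additionally permits <img>
    // with src and alt, restricting src to http and https URLs.
    static Whitelist commentWhitelist() {
        return Whitelist.basic()
                .addTags("img")
                .addAttributes("img", "src", "alt")
                .addProtocols("img", "src", "http", "https");
    }

    static String cleanComment(String html) {
        return Jsoup.clean(html, commentWhitelist());
    }

    public static void main(String[] args) {
        String unsafe = "<p>Hi <img src=\"http://example.com/a.png\" alt=\"pic\""
                + " onerror=\"steal()\"><script>steal()</script></p>";
        // The img element survives, but onerror and the script element do not.
        System.out.println(cleanComment(unsafe));
    }
}
```

Restricting the protocols matters here: without addProtocols, any value would be accepted in src, including a javascript: URL.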

The cleaner is easy to use. Besides preventing XSS attacks, it also lets you restrict which tags users may enter.
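To illustrate how the choice of whitelist limits the tags users can enter, the sketch below runs the same input through three of jsoup's presets: none(), simpleText() and basic(). The class name PresetComparison and its helper methods are hypothetical; the example assumes a jsoup version that still ships Whitelist.

```java
import org.jsoup.Jsoup;
import org.jsoup.safety.Whitelist;

class PresetComparison {

    // Text only: every tag is removed.
    static String textOnly(String html) {
        return Jsoup.clean(html, Whitelist.none());
    }

    // Simple inline formatting only: b, em, i, strong, u.
    static String simpleFormatting(String html) {
        return Jsoup.clean(html, Whitelist.simpleText());
    }

    // Basic text markup, including links with rel="nofollow" enforced.
    static String basicMarkup(String html) {
        return Jsoup.clean(html, Whitelist.basic());
    }

    public static void main(String[] args) {
        String input =
            "<p>Hello <b>bold</b> <a href=\"http://example.com/\">link</a></p>";

        System.out.println("none:       " + textOnly(input));
        System.out.println("simpleText: " + simpleFormatting(input));
        System.out.println("basic:      " + basicMarkup(input));
    }
}
```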

See also

  • See the XSS cheat sheet for examples of why regular expressions cannot be relied on, and why a whitelist cleaner built on a real HTML parser is the right choice.
  • See Cleaner to learn how to return a Document object instead of a string.
  • See Whitelist to learn how to create a custom whitelist.
  • Learn about the nofollow link attribute
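For the Cleaner reference in the list above, here is a minimal sketch of cleaning into a Document rather than a string, which is convenient when the cleaned markup needs further processing. The class name CleanerDocumentExample and the method cleanToDocument are hypothetical; Jsoup.parseBodyFragment and Cleaner.clean are existing jsoup calls, and the example assumes a jsoup version that still ships Whitelist.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.safety.Cleaner;
import org.jsoup.safety.Whitelist;

class CleanerDocumentExample {

    // Parses the fragment, then copies only whitelisted nodes
    // into a new Document and returns it.
    static Document cleanToDocument(String bodyHtml) {
        Document dirty = Jsoup.parseBodyFragment(bodyHtml);
        return new Cleaner(Whitelist.basic()).clean(dirty);
    }

    public static void main(String[] args) {
        String unsafe =
            "<p><a href='http://example.com/' onclick='stealCookies()'>Link</a></p>";
        Document clean = cleanToDocument(unsafe);

        // The result can be queried like any other jsoup Document.
        for (Element link : clean.select("a[href]")) {
            System.out.println(link.attr("href") + " rel=" + link.attr("rel"));
        }
    }
}
```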