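During development I ran into a strange-looking encoding:
&#27599;&#26085;&#19968;&#33394;|&#34013;&#30333;~
Decoding it with decode/unescape/decodeURI had no effect. After some digging, here is a summary.
What looks like an odd encoding is in fact not an encoding at all, but a markup construct called NCR (Numeric Character Reference).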
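A quick check illustrates why those functions do nothing here. This is only a minimal sketch, runnable in Node.js or a browser console, and the variable names are illustrative. The URI decoders act on percent-escapes such as %E6, so a string made of NCRs passes through them unchanged.

```javascript
// An NCR string: decimal code points wrapped in "&#" and ";"
var ncr = "&#27599;&#26085;";

// URI decoders only act on percent-escapes (e.g. "%E6%AF%8F"),
// so the NCR string comes back untouched.
var viaDecodeURI = decodeURI(ncr);
var viaDecodeURIComponent = decodeURIComponent(ncr);
var viaUnescape = unescape(ncr); // legacy function, shown only for comparison

console.log(viaDecodeURI === ncr);          // true
console.log(viaDecodeURIComponent === ncr); // true
console.log(viaUnescape === ncr);           // true
```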
Numeric Character Reference
Here is how Wikipedia explains it:
A numeric character reference (NCR) is a common markup construct used in SGML and other SGML-related markup languages such as HTML and XML. It consists of a short sequence of characters that, in turn, represent a single character from the Universal Charact
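In short: an NCR is a common markup construct in SGML and SGML-related markup languages such as HTML and XML, made up of a short character sequence that stands for one single character from the world's writing systems.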
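An NCR consists of an ampersand (&) followed by a number sign (#), then the Unicode code point value of the character, and finally a semicolon. The related forms look like this:
&#dddd; &#xhhhh; &name;
Here dddd is the character's code point in decimal, and hhhh is the same code point in hexadecimal (note the x after the number sign). The third form has no number sign and refers to the character by name.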
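Taking HTML as an example, all three of these escape sequences are called character references:
The first two are numeric character references (NCR), whose numeric value is the Unicode code point of the target character; those starting with 「&#」 are followed by decimal digits, and those starting with 「&#x」 are followed by hexadecimal digits.
The last one is a character entity reference, which is followed by a predefined entity name; the entity declares the character it stands for.
Since HTML 4, NCRs are always based on Unicode, regardless of the document's encoding.
The two characters of 「中国」 are the Unicode characters U+4E2D and U+56FD. The hexadecimal code point values 「4E2D」 and 「56FD」 are 「20013」 and 「22269」 in decimal. So:
&#x4E2D;&#x56FD; &#20013;&#22269;
Both of these NCR forms are converted to 「中国」 when displayed.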
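The relationship between a character, its code point, and the two NCR spellings can be checked with a small sketch. The helper name toNcr is hypothetical (it is not a built-in), and the code assumes an ES2015-capable runtime for Array.from and codePointAt.

```javascript
// Build the NCR spelling for every character in a string.
// toNcr is a hypothetical helper name, not a built-in.
function toNcr(text, hex) {
  // Array.from iterates by code point, so characters outside the BMP stay whole
  return Array.from(text).map(function (ch) {
    var cp = ch.codePointAt(0);
    return hex
      ? "&#x" + cp.toString(16).toUpperCase() + ";" // hexadecimal form
      : "&#" + cp + ";";                            // decimal form
  }).join("");
}

console.log(toNcr("中国", true));  // &#x4E2D;&#x56FD;
console.log(toNcr("中国", false)); // &#20013;&#22269;
```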
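How to convert NCR characters into actual characters
The approach is as follows:
var regex_num_set = /&#(\d+);/g;
var str = "Here is some text: &#27599;&#26085;&#19968;&#33394;|&#34013;&#30333;~";
// Replace each decimal NCR with the character for its code point
str = str.replace(regex_num_set, function (_, $1) {
  return String.fromCharCode($1);
});
document.write('<pre>' + JSON.stringify(str, null, 3) + '</pre>');
The example above uses the String.prototype.replace() and String.fromCharCode() methods. The idea is to go through the NCRs in the string one by one, capture the Unicode code value between 「&#」 and 「;」, and then use String.fromCharCode() to turn that code value into the actual character. Note that this regular expression only matches the decimal form.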
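The same idea can be extended to cover the hexadecimal form as well. The following is a sketch rather than a definitive implementation: decodeNcr is a hypothetical helper name, and it swaps in String.fromCodePoint() (ES2015) for String.fromCharCode() so that code points above U+FFFF decode correctly. Named entities such as &amp; are deliberately left alone.

```javascript
// decodeNcr is a hypothetical helper name, not a built-in.
function decodeNcr(text) {
  // Group 1: hexadecimal digits after "&#x"; group 2: decimal digits after "&#"
  var pattern = /&#(?:[xX]([0-9a-fA-F]+)|(\d+));/g;
  return text.replace(pattern, function (match, hex, dec) {
    var cp = hex !== undefined ? parseInt(hex, 16) : parseInt(dec, 10);
    // Leave the text untouched if the value is outside the Unicode range
    if (cp > 0x10FFFF) {
      return match;
    }
    return String.fromCodePoint(cp);
  });
}

console.log(decodeNcr("&#x4E2D;&#x56FD; &#20013;&#22269;")); // 中国 中国
```

String.fromCharCode() treats its arguments as UTF-16 code units, so a value such as 0x1F600 would be truncated to 16 bits; String.fromCodePoint() produces the correct surrogate pair instead.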
Blog post: http://joebon.cc/convert-numeric-chracter-reference-to-actual-character