


A recent project has a requirement to use js to calculate the memory occupied by a string written to localStorage. As we all know, js uses Unicode encoding. There are N types of Unicode implementations, among which UTF-8 and UTF-16 are the most commonly used. Therefore, this article only discusses these two encodings.
The following definition is taken from Wikipedia (http://zh.wikipedia.org/zh-cn/UTF-8), with some deletions.
UTF-8 (8-bit Unicode Transformation Format) is a variable-length character encoding for Unicode that can represent any character in the Unicode standard, and the first byte in its encoding is still compatible with ASCII , using one to four bytes to encode each character
The encoding rules are as follows:
Character codes between 000000 – 00007F are encoded with one byte;
Characters between 000080 – 0007FF use two bytes;
Use three bytes between 000800 – 00D7FF and 00E000 – 00FFFF. Note: Unicode does not have any characters in the range D800-DFFF;
Use 4 bytes between 010000 – 10FFFF.
UTF-16 is a fixed-length character encoding. Most characters use two bytes to encode, and character codes exceeding 65535 use four bytes, as follows:
000000 – 00FFFF two bytes;
010000 – 10FFFF four bytes.
At first I thought that since the page is encoded in UTF-8, the strings stored in localStorage should also be encoded in UTF-8. But later tests found that the calculated size was less than 5MB, but an exception was thrown when saving it to localStorage. After thinking about it, the encoding of the page can be changed. If localStorage stores strings according to the encoding of the page, wouldn't it be a mess? Browsers should all use UTF-16 encoding. The 5MB string was calculated using UTF-16 encoding, and it was successfully written. If it exceeds, it fails.
Okay, here’s the code implementation. The calculation rules are as written above. For calculation speed, the two for loops are written separately.
/** * 计算字符串所占的内存字节数,默认使用UTF-8的编码方式计算,也可制定为UTF-16 * UTF-8 是一种可变长度的 Unicode 编码格式,使用一至四个字节为每个字符编码 * * 000000 - 00007F(128个代码) 0zzzzzzz(00-7F) 一个字节 * 000080 - 0007FF(1920个代码) 110yyyyy(C0-DF) 10zzzzzz(80-BF) 两个字节 * 000800 - 00D7FF 00E000 - 00FFFF(61440个代码) 1110xxxx(E0-EF) 10yyyyyy 10zzzzzz 三个字节 * 010000 - 10FFFF(1048576个代码) 11110www(F0-F7) 10xxxxxx 10yyyyyy 10zzzzzz 四个字节 * * 注: Unicode在范围 D800-DFFF 中不存在任何字符 * {@link http://zh.wikipedia.org/wiki/UTF-8} * * UTF-16 大部分使用两个字节编码,编码超出 65535 的使用四个字节 * 000000 - 00FFFF 两个字节 * 010000 - 10FFFF 四个字节 * * {@link http://zh.wikipedia.org/wiki/UTF-16} * @param {String} str * @param {String} charset utf-8, utf-16 * @return {Number} */ var sizeof = function(str, charset){ var total = 0, charCode, i, len; charset = charset ? charset.toLowerCase() : ''; if(charset === 'utf-16' || charset === 'utf16'){ for(i = 0, len = str.length; i < len; i++){ charCode = str.charCodeAt(i); if(charCode <= 0xffff){ total += 2; }else{ total += 4; } } }else{ for(i = 0, len = str.length; i < len; i++){ charCode = str.charCodeAt(i); if(charCode <= 0x007f) { total += 1; }else if(charCode <= 0x07ff){ total += 2; }else if(charCode <= 0xffff){ total += 3; }else{ total += 4; } } } return total; }

Detailed explanation of JavaScript string replacement method and FAQ This article will explore two ways to replace string characters in JavaScript: internal JavaScript code and internal HTML for web pages. Replace string inside JavaScript code The most direct way is to use the replace() method: str = str.replace("find","replace"); This method replaces only the first match. To replace all matches, use a regular expression and add the global flag g: str = str.replace(/fi

Leverage jQuery for Effortless Web Page Layouts: 8 Essential Plugins jQuery simplifies web page layout significantly. This article highlights eight powerful jQuery plugins that streamline the process, particularly useful for manual website creation

So here you are, ready to learn all about this thing called AJAX. But, what exactly is it? The term AJAX refers to a loose grouping of technologies that are used to create dynamic, interactive web content. The term AJAX, originally coined by Jesse J

This post compiles helpful cheat sheets, reference guides, quick recipes, and code snippets for Android, Blackberry, and iPhone app development. No developer should be without them! Touch Gesture Reference Guide (PDF) A valuable resource for desig

jQuery is a great JavaScript framework. However, as with any library, sometimes it’s necessary to get under the hood to discover what’s going on. Perhaps it’s because you’re tracing a bug or are just curious about how jQuery achieves a particular UI

10 fun jQuery game plugins to make your website more attractive and enhance user stickiness! While Flash is still the best software for developing casual web games, jQuery can also create surprising effects, and while not comparable to pure action Flash games, in some cases you can also have unexpected fun in your browser. jQuery tic toe game The "Hello world" of game programming now has a jQuery version. Source code jQuery Crazy Word Composition Game This is a fill-in-the-blank game, and it can produce some weird results due to not knowing the context of the word. Source code jQuery mine sweeping game

Article discusses creating, publishing, and maintaining JavaScript libraries, focusing on planning, development, testing, documentation, and promotion strategies.

This tutorial demonstrates how to create a captivating parallax background effect using jQuery. We'll build a header banner with layered images that create a stunning visual depth. The updated plugin works with jQuery 1.6.4 and later. Download the


Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

AI Hentai Generator
Generate AI Hentai for free.

Hot Article

Hot Tools

ZendStudio 13.5.1 Mac
Powerful PHP integrated development environment

Safe Exam Browser
Safe Exam Browser is a secure browser environment for taking online exams securely. This software turns any computer into a secure workstation. It controls access to any utility and prevents students from using unauthorized resources.

DVWA
Damn Vulnerable Web App (DVWA) is a PHP/MySQL web application that is very vulnerable. Its main goals are to be an aid for security professionals to test their skills and tools in a legal environment, to help web developers better understand the process of securing web applications, and to help teachers/students teach/learn in a classroom environment Web application security. The goal of DVWA is to practice some of the most common web vulnerabilities through a simple and straightforward interface, with varying degrees of difficulty. Please note that this software

SublimeText3 English version
Recommended: Win version, supports code prompts!

VSCode Windows 64-bit Download
A free and powerful IDE editor launched by Microsoft
