4 types of JavaScript memory leaks and how to avoid them
Original text: 4 Types of Memory Leaks in JavaScript and How to Get Rid Of Them
Translation from: Alon's Blog
This article explores common client-side JavaScript memory leaks and how to use Chrome Dev Tools to find them.
Introduction
Memory leaks are a problem every developer eventually faces, and they are the source of many others: sluggish response, crashes, high latency, and other application problems.
What is a memory leak?
In essence, a memory leak is memory that the application no longer needs but that, for some reason, is not returned to the operating system or to the pool of free memory. Programming languages differ in how they manage memory. Only the developer knows for certain which memory is no longer needed and can be returned. Some languages provide features that help developers with this; others expect developers to be explicit about when a piece of memory is no longer in use.
JavaScript Memory Management
JavaScript is a garbage-collected language. Garbage-collected languages help developers manage memory by periodically checking which previously allocated memory can still be reached from other parts of the application. In other words, they reduce the question "which memory is still needed?" to "which memory is still reachable?". The difference is subtle but important: only the developer knows whether a piece of memory will be needed in the future, whereas unreachable memory can be determined algorithmically and marked for reclamation.
JavaScript Memory Leak
The main cause of memory leaks in garbage-collected languages is unwanted references. To understand what these are, you first need to understand how a garbage collector decides whether a piece of memory is reachable.
Mark-and-sweep
Most garbage collectors use an algorithm called mark-and-sweep. It consists of the following steps:
1. The garbage collector builds a list of "roots". Roots are usually global variables to which the code holds a reference. In JavaScript, the "window" object is an example of a global variable that can act as a root. The window object is always present, so the garbage collector can treat it and all of its children as always present (i.e., not garbage).
2. All roots are inspected and marked as active (i.e., not garbage). All of their children are inspected recursively as well. Everything that can be reached from a root is not considered garbage.
3. All memory not marked as active can now be considered garbage. The collector can free it and return it to the operating system.
Modern garbage collectors have improved algorithms, but the essence is the same: reachable memory is marked, and the rest is garbage collected.
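The three steps above can be sketched on a toy object graph. This is only an illustration under stated assumptions: the heap is modelled as an array of plain objects with a refs list, and markAndSweep, windowRoot and the orphan objects are hypothetical names, not part of any JavaScript engine.

```javascript
// Toy heap: each object is { name, refs: [...] }. Nothing here is a real engine API.
function markAndSweep(heap, roots) {
  const marked = new Set();
  const stack = [...roots];

  // Mark phase: everything reachable from a root is alive.
  while (stack.length > 0) {
    const obj = stack.pop();
    if (marked.has(obj)) continue;
    marked.add(obj);
    for (const child of obj.refs) stack.push(child);
  }

  // Sweep phase: whatever was never marked is garbage.
  const garbage = heap.filter((obj) => !marked.has(obj));
  const survivors = heap.filter((obj) => marked.has(obj));
  return { survivors, garbage };
}

const windowRoot = { name: "window", refs: [] };
const cache = { name: "cache", refs: [] };
const orphanA = { name: "orphanA", refs: [] };
const orphanB = { name: "orphanB", refs: [] };

windowRoot.refs.push(cache);
// orphanA and orphanB reference each other, but no root reaches them.
orphanA.refs.push(orphanB);
orphanB.refs.push(orphanA);

const result = markAndSweep([windowRoot, cache, orphanA, orphanB], [windowRoot]);
console.log(result.garbage.map((o) => o.name)); // the two orphans
```

Note that the two orphans are collected even though they reference each other: reachability from a root is what counts, not whether something still points at the object.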
An unwanted reference is a reference to memory that the developer knows is no longer needed but that, for some reason, is still kept inside the tree of an active root. In JavaScript, unwanted references are variables kept somewhere in the code that will not be used again and that point to memory that could otherwise be freed. Some would argue these are developer mistakes.
In order to understand the most common memory leaks in JavaScript, we need to understand how references are easily forgotten.
Four types of common JavaScript memory leaks
1: Unexpected global variables
JavaScript handles undeclared variables in a permissive way: an assignment to an undeclared variable creates a new variable on the global object. In browsers, the global object is window. In other words:

function foo(arg) {
    bar = "this is a hidden global variable";
}

is in fact equivalent to:

function foo(arg) {
    window.bar = "this is an explicit global variable";
}

If bar was meant to hold a reference only inside the scope of foo and you forget to declare it with var, an unexpected global variable is created. This example leaks a simple string, which is harmless, but it could certainly be worse.

Another way an accidental global variable can be created is through this:

function foo() {
    this.variable = "potential accidental global";
}

// foo called on its own: this points to the global object (window)
// rather than being undefined.
foo();
Adding 'use strict' at the top of a JavaScript file avoids these mistakes. It enables a stricter mode of parsing JavaScript that prevents accidental global variables.
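A minimal sketch of what strict mode changes for the two cases above. The directive is placed inside each function here so that the snippet behaves the same wherever it is pasted; at the top of a file it covers the whole file. The names foo, bar, usesThis and caught are illustrative only.

```javascript
function foo() {
  "use strict";
  // No var/let/const: in sloppy mode this would create a global named bar.
  bar = "this would have been a hidden global";
}

let caught = null;
try {
  foo();
} catch (err) {
  caught = err;
}
console.log(caught instanceof ReferenceError); // true

function usesThis() {
  "use strict";
  // For a plain call in strict mode, this is undefined, not the global object.
  return this;
}
console.log(usesThis() === undefined); // true
```

Both mistakes now fail loudly at the point where they happen instead of silently attaching data to the global object.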
Global Variable Notes
Although we have been talking about unexpected globals, plenty of code is littered with explicit global variables. These are by definition non-collectable (unless they are set to null or reassigned). Globals used to temporarily store and process large amounts of information are of particular concern. If you must use a global variable to store a lot of data, make sure to set it to null or reassign it once you are done with it. One common cause of increased memory consumption tied to globals is caches. Caches store data for reuse, and to be useful they must have an upper bound on their size. A cache that grows without bound can lead to high memory consumption because its contents cannot be collected.
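One way to give a cache an upper bound is to evict the least recently used entry once a limit is reached. The sketch below is an assumption about how that might look, not something from the original article; BoundedCache and maxEntries are hypothetical names.

```javascript
// Hypothetical helper: a cache that evicts its least recently used entry
// once it holds more than maxEntries items.
class BoundedCache {
  constructor(maxEntries) {
    this.maxEntries = maxEntries;
    this.entries = new Map(); // a Map iterates in insertion order
  }

  get(key) {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key);
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key, value) {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxEntries) {
      // The first key in iteration order is the least recently used.
      const oldestKey = this.entries.keys().next().value;
      this.entries.delete(oldestKey);
    }
  }

  get size() {
    return this.entries.size;
  }
}

const cache = new BoundedCache(2);
cache.set("a", 1);
cache.set("b", 2);
cache.get("a");    // "a" is now the most recently used
cache.set("c", 3); // evicts "b"
console.log(cache.size, cache.get("b")); // 2 undefined
```

Evicted values are no longer referenced by the cache, so they become collectable even though the cache itself is still reachable from a global.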
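2: Forgotten timers or callbacks
Using setInterval is very common in JavaScript. A typical piece of code:

var someResource = getData();
setInterval(function() {
    var node = document.getElementById('Node');
    if (node) {
        // Do stuff with node and someResource.
        node.innerHTML = JSON.stringify(someResource);
    }
}, 1000);

This example illustrates what can happen with dangling timers: timers that reference nodes or data that are no longer needed. The node object may be removed in the future, making the whole callback unnecessary. Yet the callback cannot be collected while the interval is still active (it has to be stopped first). And because the callback cannot be collected, neither can its dependencies: someResource, which presumably holds a sizeable amount of data, cannot be collected either.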
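A minimal sketch of keeping the timer handle so the interval can be stopped once it is no longer needed. startPolling is a hypothetical helper, and the callback body is a stand-in for the DOM update in the example above so that the snippet runs outside a page.

```javascript
// Hypothetical helper: start a repeating task and return a way to stop it.
function startPolling(task, intervalMs) {
  let timerId = setInterval(task, intervalMs);

  return {
    isRunning() {
      return timerId !== null;
    },
    stop() {
      if (timerId !== null) {
        clearInterval(timerId); // the timer no longer references task
        timerId = null;
      }
    },
  };
}

let someResource = { payload: "large data would live here" };
const polling = startPolling(function () {
  // In a page, this is where the node would be updated from someResource.
  JSON.stringify(someResource);
}, 1000);

// Once the node is gone or the data is no longer needed:
polling.stop();
someResource = null; // drop our own reference too
```

After stop() the interval no longer holds the callback, so the callback and whatever it captured can be collected once nothing else refers to them.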
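For observers, it is important to remove them explicitly once they are no longer needed (or once the associated object is about to become unreachable). Old browsers, IE 6 in particular, could not handle cyclic references well. Nowadays most browsers can and will collect observer handlers once the observed object becomes unreachable, even if the listener was never explicitly removed.
Observer example:

var element = document.getElementById('button');

function onClick(event) {
    element.innerHTML = 'text';
}

element.addEventListener('click', onClick);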
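Where explicit removal is still wanted, the handler has to be removed with the same function reference that was registered. The sketch below uses a bare EventTarget as a stand-in for a DOM element so that it runs outside a page; that substitution and the clicks counter are assumptions made for illustration.

```javascript
// EventTarget stands in for a DOM element so the sketch runs outside a page.
const element = new EventTarget();
let clicks = 0;

function onClick() {
  clicks += 1;
}

element.addEventListener("click", onClick);
element.dispatchEvent(new Event("click")); // handler runs once

// Pass the same function reference that was registered.
element.removeEventListener("click", onClick);
element.dispatchEvent(new Event("click")); // handler no longer runs

console.log(clicks); // 1
```

An anonymous function passed straight to addEventListener cannot be removed this way, which is one reason to keep handlers in named functions or variables.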
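A note on object observers and cyclic references
Older versions of IE could not detect cyclic references between DOM nodes and JavaScript code, which caused memory leaks. Modern browsers (including IE and Microsoft Edge) use more advanced garbage collection algorithms that detect and handle these cycles correctly. In other words, it is no longer strictly necessary to call removeEventListener before a node can be collected.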
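3: Out of DOM references
Sometimes it is useful to store DOM nodes inside data structures. Suppose you want to quickly update the contents of several rows in a table: it makes sense to store a reference to each DOM row in a dictionary (key-value pairs) or an array. When you do, two references to the same DOM element exist: one in the DOM tree and the other in the dictionary. If you later decide to remove these rows, you need to clear both references.

var elements = {
    button: document.getElementById('button'),
    image: document.getElementById('image'),
    text: document.getElementById('text')
};

function doStuff() {
    elements.image.src = 'http://some.url/image';
    elements.button.click();
    console.log(elements.text.innerHTML);
    // Much more logic
}

function removeButton() {
    // The button is a direct child of body.
    document.body.removeChild(document.getElementById('button'));
    // At this point a reference to #button is still held in the global
    // elements dictionary, so the button element stays in memory and
    // cannot be collected by the GC.
}

There is a further consideration concerning references to inner or leaf nodes of a DOM tree. Suppose your JavaScript code keeps a reference to a specific cell of a table (a <td> element). If you later decide to remove the table from the DOM but keep the reference to that cell, intuition suggests the GC will collect everything except the cell. In practice this does not happen: the cell is a child node of the table, and children keep references to their parents. The reference to the <td> held by the code therefore keeps the whole table in memory. Take care when keeping references to DOM elements.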
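One way to make sure both references are cleared together is to route removals through a single helper. This is a sketch under stated assumptions: RowRegistry is a hypothetical class, and plain objects with a remove() method stand in for DOM rows so that the snippet runs outside a browser (in a page, Element.remove() detaches the node from its parent).

```javascript
// Hypothetical registry: keeps references to row nodes by id and makes sure
// removing a row clears the reference held here as well as the one in the tree.
class RowRegistry {
  constructor() {
    this.rows = new Map();
  }

  add(id, node) {
    this.rows.set(id, node);
  }

  remove(id) {
    const node = this.rows.get(id);
    if (!node) return false;
    node.remove();        // detach from the tree
    this.rows.delete(id); // drop the second reference so the node can be collected
    return true;
  }
}

// A plain object with a remove() method stands in for a DOM row here.
const fakeRow = { attached: true, remove() { this.attached = false; } };
const registry = new RowRegistry();
registry.add("row-1", fakeRow);
registry.remove("row-1");
console.log(registry.rows.has("row-1"), fakeRow.attached); // false false
```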
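4: Closures
Closures are a key aspect of JavaScript development: anonymous functions can access variables from their parent scopes.
Code example:

var theThing = null;
var replaceThing = function () {
    var originalThing = theThing;
    var unused = function () {
        if (originalThing)
            console.log("hi");
    };
    theThing = {
        longStr: new Array(1000000).join('*'),
        someMethod: function () {
            // Never called in this snippet; someMessage is not defined here.
            console.log(someMessage);
        }
    };
};
setInterval(replaceThing, 1000);

This snippet does one thing: every time replaceThing is called, theThing gets a new object that contains a large string (longStr) and a new closure (someMethod). At the same time, the variable unused holds a closure that references originalThing (the theThing from the previous call to replaceThing). Already somewhat confusing? The important point is that once a scope is created for closures that share the same parent scope, that scope is shared. Here the scope created for the closure someMethod is shared with unused. someMethod can be used through theThing, and even though unused is never used, its reference to originalThing forces originalThing to stay active (it cannot be collected). When this snippet runs repeatedly, memory usage rises steadily and does not drop when the GC runs. In essence, a linked list of closures is created, and each closure scope carries an indirect reference to the large string, resulting in a sizeable leak.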
A blog post by Meteor explains how to fix this kind of problem: add originalThing = null at the end of replaceThing.
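Applied to the snippet above, the fix might look like the sketch below. Two things are changed for illustration only: the body of someMethod is simplified, and the setInterval call is replaced by a few direct calls so that the snippet terminates.

```javascript
var theThing = null;

var replaceThing = function () {
  var originalThing = theThing;
  var unused = function () {
    if (originalThing) console.log("hi");
  };
  theThing = {
    longStr: new Array(1000000).join("*"),
    someMethod: function () {
      return "someMethod called";
    },
  };
  // Break the chain: the shared scope no longer points at the previous object.
  originalThing = null;
};

// The original runs this on a timer: setInterval(replaceThing, 1000);
// A few direct calls keep the sketch finite.
for (var i = 0; i < 3; i++) {
  replaceThing();
}
console.log(theThing.longStr.length); // 999999
```

The shared scope still exists, but because originalThing is null by the time replaceThing returns, each scope no longer retains the previous theThing and the chain of old objects can be collected.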
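Chrome memory profiling tools overview
Chrome provides a nice set of tools for profiling the memory usage of JavaScript code. Two views are essential for memory work: timeline and profiles.
The timeline view is essential for discovering unusual memory patterns in code. In the screenshot, the potentially leaked objects grow steadily; by the end of the recording, memory usage is clearly higher than at the start, and the total number of nodes is also higher. All signs point to DOM nodes being leaked somewhere in the code.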
Profiles
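The profiles view is where you will spend most of your time. It lets you take snapshots, compare snapshots of the memory use of your JavaScript code, and record allocations over time. Each result offers several kinds of lists; the ones relevant to memory leaks are the summary list and the comparison list.
The summary list shows the different types of objects allocated and their aggregate sizes: shallow size (the sum of the sizes of all objects of a given type) and retained size (the shallow size plus the size of the objects retained because of them). It also gives a notion of how far an object is from its GC root (the distance).
The comparison list compares different snapshots, which is how leaks are found.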
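Example: finding leaks using Chrome
There are essentially two types of leaks: leaks that cause periodic increases in memory use, and leaks that happen once and cause no further increase. Periodic leaks are, for obvious reasons, easier to find. One-off leaks are trickier and usually go unnoticed; a leak that happens once might be treated as an optimization issue, whereas periodic leaks are bugs that must be fixed.
Take the code from the Chrome documentation as an example:

var x = [];

function createSomeNodes() {
    var p,
        i = 100,
        frag = document.createDocumentFragment();
    for (; i > 0; i--) {
        p = document.createElement("p");
        p.appendChild(document.createTextNode(i + " - " + new Date().toTimeString()));
        frag.appendChild(p);
    }
    document.getElementById("nodes").appendChild(frag);
}

function grow() {
    x.push(new Array(1000000).join('x'));
    createSomeNodes();
    setTimeout(grow, 1000);
}

When grow is invoked, it starts creating p nodes and appending them to the DOM, and it appends a very large string to a global array. This causes a steady increase in memory that the tools mentioned above can detect.
Finding periodic memory growth
The timeline view is great for this. Open the example in Chrome, open Dev Tools, switch to timeline, tick memory and click the record button. Then click The Button on the page. After a while, stop the recording and look at the results:
Two signs in the recording show that memory is leaking: the Nodes graph (green line) and the JS heap graph (blue line). Nodes grow steadily and never drop, which is a strong warning sign.
The JS heap also shows a steady increase in memory usage. This is harder to see because of the effect of the garbage collector. The graph shows memory usage rising and falling, but after each drop the JS heap is larger than it was after the previous one. In other words, although the garbage collector keeps reclaiming memory, some of it is leaked on every cycle.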
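The reasoning about the sawtooth can be written down as a small check: find the low points that follow each collection and see whether they keep rising. This is a rough sketch, not a feature of Dev Tools; troughsKeepRising is a hypothetical helper and the sample values are made up.

```javascript
// Hypothetical helper: given heap-size samples, pick out the troughs
// (a sample lower than the previous one and not higher than the next one,
// i.e. just after a collection) and report whether each trough is higher
// than the one before it.
function troughsKeepRising(samples) {
  const troughs = [];
  for (let i = 1; i < samples.length - 1; i++) {
    if (samples[i] < samples[i - 1] && samples[i] <= samples[i + 1]) {
      troughs.push(samples[i]);
    }
  }
  if (troughs.length < 2) return false; // not enough evidence either way
  return troughs.every((value, i) => i === 0 || value > troughs[i - 1]);
}

// Made-up numbers in MB, shaped like a sawtooth.
const leaking = [10, 14, 18, 12, 16, 20, 15, 19, 23, 18, 22];
const healthy = [10, 14, 18, 10, 14, 18, 10, 14, 18, 10, 14];

console.log(troughsKeepRising(leaking)); // true
console.log(troughsKeepRising(healthy)); // false
```

In the leaking series the troughs are 12, 15 and 18, so the heap never returns to its earlier baseline; in the healthy series every collection brings it back to 10.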
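Now that we are certain there is a leak, let's find its source.
Taking two snapshots
Switch to the profiles tab of Chrome Dev Tools and refresh the page. Once the page has finished reloading, click Take Heap Snapshot to save a baseline snapshot. Then click The Button again, wait a few seconds, and take a second snapshot.
Either choose Summary in the filter menu and then select Objects allocated between Snapshot 1 and Snapshot 2 on the right, or choose Comparison in the filter menu. Either way, a list of the objects that differ between the two snapshots is shown.
In this example the leak is easy to find: look at the Size Delta of the (string) constructor, which shows 8 MB and 58 new objects. These new objects were allocated but never freed, and they take up 8 MB.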
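What the comparison does can be sketched as a per-constructor difference between two summaries. The data shape, the compareSnapshots helper and all of the numbers below are hypothetical; real heap snapshots are far richer than this.

```javascript
// Hypothetical snapshot summaries: constructor name -> { count, size in bytes }.
function compareSnapshots(before, after) {
  const rows = [];
  for (const name of Object.keys(after)) {
    const prev = before[name] || { count: 0, size: 0 };
    rows.push({
      ctor: name,
      countDelta: after[name].count - prev.count,
      sizeDelta: after[name].size - prev.size,
    });
  }
  // Largest growth first, like sorting the comparison list by Size Delta.
  return rows.sort((a, b) => b.sizeDelta - a.sizeDelta);
}

// Made-up numbers for illustration only.
const snapshot1 = {
  "(string)": { count: 100, size: 50000 },
  Text: { count: 10, size: 400 },
};
const snapshot2 = {
  "(string)": { count: 158, size: 8050000 },
  Text: { count: 110, size: 4400 },
  HTMLParagraphElement: { count: 100, size: 8000 },
};

const rows = compareSnapshots(snapshot1, snapshot2);
console.log(rows[0]); // the (string) row, with the largest size delta
```

Sorting by size delta puts the constructor responsible for most of the growth at the top, which is the row to expand first.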
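Expanding the (string) constructor reveals many individual allocations. Selecting any one of them shows its retainers in the panel below, which deserve our attention.
The allocation we selected is part of an array, and the array is referenced by the variable x on the window object. This gives the complete path from the large object to a root (window) that cannot be collected. We have found a potential leak and where it is referenced.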
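The idea of a retainer path can be illustrated on a toy object graph with a breadth-first search from the root. findRetainerPath, fakeWindow and bigString are hypothetical stand-ins; this is not how Dev Tools computes retainers internally.

```javascript
// Toy object graph: plain objects whose properties may point at other objects.
// Returns the property names leading from root to target, or null if unreachable.
function findRetainerPath(root, target) {
  const visited = new Set([root]);
  const queue = [{ node: root, path: [] }];

  while (queue.length > 0) {
    const { node, path } = queue.shift();
    if (node === target) return path;
    for (const key of Object.keys(node)) {
      const child = node[key];
      if (child !== null && typeof child === "object" && !visited.has(child)) {
        visited.add(child);
        queue.push({ node: child, path: [...path, key] });
      }
    }
  }
  return null;
}

const bigString = { kind: "string", length: 999999 }; // stand-in for a leaked string
const fakeWindow = { x: [bigString], document: {} };   // stand-in for the global object

console.log(findRetainerPath(fakeWindow, bigString)); // [ 'x', '0' ]
```

The path x, then index 0, mirrors what the retainers panel shows: the leaked value is an element of the array held by the global variable x.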
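Our example is simple: only a small number of DOM nodes are leaked, and the snapshots above make them easy to spot. For larger sites, Chrome also provides the Record Heap Allocations feature.
Finding leaks with Record Heap Allocations
Go back to the profiles tab of Chrome Dev Tools and click Record Heap Allocations. While the tool is running, note the blue bars at the top: they represent allocations, and a large allocation is made every second. Stop the recording after a few seconds.
The figure shows the killer feature of this tool: selecting a section of the timeline shows the allocations made during that time span. Set the selection as close as possible to one of the peaks. The list then shows only three constructors: the one that leaks the most ((string)), the related DOM allocations, and the Text constructor (the text contained in the DOM leaf nodes).
Select one of the HTMLParagraphElement constructors from the list and then select Allocation stack.
Now you know where the elements were allocated (grow -> createSomeNodes). Looking closely at the timeline in the figure, the HTMLParagraphElement constructor is called many times, which means that memory keeps being occupied and cannot be reclaimed by the GC. We also know the exact location where these objects were allocated (createSomeNodes). It is now time to go back to the code, study it, and fix the leak.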
Another useful feature
In the heap allocations results view, select the Allocation view.
This view lists the functions together with the memory allocations associated with them, and grow and createSomeNodes stand out immediately. Selecting grow shows the object constructors associated with it: (string), HTMLParagraphElement and Text, which we already know are the constructors of the leaked objects.
Combining the tools described above makes it much easier to find memory leaks.
Extended reading
Memory Management - Mozilla Developer Network
JScript Memory Leaks - Douglas Crockford (old, in relation to Internet Explorer 6 leaks)
JavaScript Memory Profiling - Chrome Developer Docs
Memory Diagnosis - Google Developers
An Interesting Kind of JavaScript Memory Leak - Meteor blog
Grokking V8 closures