
Memory Optimization in JavaScript

高洛峰 | Original: 2017-02-04 13:12:40 | 1346 views

Compared with C/C++, JavaScript's memory handling lets us concentrate on writing business logic during development. However, as business logic grows more complex and single-page applications, mobile HTML5 applications, Node.js programs, and the like become common, lag and out-of-memory failures caused by memory problems in JavaScript are no longer unusual.

This article discusses memory usage and optimization at the language level of JavaScript. We will work through the topic one piece at a time, from aspects everyone knows or has heard of to areas that usually go unnoticed.

1. Memory management at the language level

1.1 Scope

Scope is a very important mechanism in JavaScript programming. In synchronous code it rarely gets beginners' full attention, but in asynchronous programming, good control of scope is an essential skill for JavaScript developers. Scope also plays a crucial role in JavaScript memory management.

In JavaScript, a scope can be formed by a function call, a with statement, or the global scope.

Take the following code as an example:

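var foo = function() {
  var local = {};
};
foo();
// A bare console.log(local) would throw a ReferenceError here,
// so typeof is used to show that the name does not exist.
console.log(typeof local); //=> undefined

var bar = function() {
  local = {};
};
bar();
console.log(local); //=> {}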

Here we define two functions, foo() and bar(). Both are meant to define a variable named local, but the results are completely different.

In foo(), we use the var statement to declare and define a local variable. Because the function body forms a scope, the variable is defined in that scope. The body of foo() does nothing to extend that scope, so once the function has finished executing, the local variable is destroyed and cannot be accessed from the outer scope.

In bar(), local is not declared with the var statement. Instead, it is defined directly as a global variable, so the outer scope can access it.

local = {};
// This definition is equivalent to
global.local = {};
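
One common guard against this kind of accidental global is strict mode, which turns an assignment to an undeclared name into an error. The sketch below is only an illustration; the names assignWithoutVar and undeclaredLocal are hypothetical and do not come from the code above.

```javascript
function assignWithoutVar() {
  'use strict';
  try {
    // No var/let/const: in strict mode this throws instead of
    // silently creating a global variable.
    undeclaredLocal = {};
    return 'assigned';
  } catch (err) {
    return err.name;
  }
}

var strictResult = assignWithoutVar();
console.log(strictResult);           //=> ReferenceError
console.log(typeof undeclaredLocal); //=> undefined
```

Placing the directive inside the function limits strict mode to that function, which can be convenient when older non-strict code lives in the same file.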

1.2 Scope chain

In JavaScript programming you will inevitably run into functions nested several levels deep. This is a typical scope chain.

As shown in the following code:

function foo() {
  var val = 'hello';

  function bar() {
    function baz() {
      global.val = 'world';
    }
    baz();
    console.log(val); //=> hello
  }
  bar();
}
foo();

Based on the earlier description of scope, you might expect this code to print world. The actual result is hello. Many beginners get confused here, so let's look at how the code works.

In JavaScript, the lookup of an identifier starts in the current scope and proceeds outward until it reaches the global scope. Variable access therefore only works outward, never inward.


Executing baz() defines a global variable val in the global scope. When bar() accesses the identifier val, the lookup goes from the inside out: val is not found in the scope of bar(), so the search moves up one level, to the scope of foo().

The point that confuses people is this: the lookup finds a matching variable in the scope of foo(), so it does not continue outward. The global variable val defined in baz() therefore has no effect on this access.
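
The lookup order can be seen in a smaller sketch that runs without the global object. The function names outer, middle, and inner and the string values are hypothetical, chosen only to label which scope each value comes from.

```javascript
var val = 'global';              // outermost scope

function outer() {
  var val = 'outer';

  function middle() {
    // No `val` is declared in middle(), so a lookup here continues outward.
    function inner() {
      var val = 'inner';
      return val;                // found in the current scope
    }
    return [inner(), val];       // this `val` resolves to outer()'s variable
  }

  return middle();
}

var resolved = outer();
console.log(resolved); //=> [ 'inner', 'outer' ]
```

The outermost val is never reached, because each lookup stops at the first scope that declares the name.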

1.3 Closure

We know that identifier lookup in JavaScript follows the inside-out principle. As business logic grows more complex, however, a single lookup order is far from enough to meet new needs.

Let's look at the following code first:

function foo() {
  var local = 'Hello';
  return function() {
    return local;
  };
}
var bar = foo();
console.log(bar()); //=> Hello

The technique shown here, which lets the outer scope access an inner scope, is the closure. Thanks to higher-order functions, the scope of foo() has been "extended".

foo() returns an anonymous function that lives in the scope of foo(), so it can access the local variable in that scope and keep a reference to it. Because this function returns local directly, calling bar() in the outer scope yields the value of local.

Closures are an advanced feature of JavaScript that we can use to achieve more complex effects. Note, however, that because a function holding references to inner variables has been taken out of the enclosing function, the variables in that scope are not necessarily destroyed when the enclosing function finishes. They survive until all references to them are released. Careless use of closures can therefore easily leave memory that cannot be freed.
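
The following sketch illustrates that lifetime under simple assumptions. createReader and payload are hypothetical names, and the array merely stands in for some large piece of data.

```javascript
function createReader() {
  // Stand-in for a large object captured by the closure below.
  var payload = new Array(1000).fill('x');

  return function read() {
    return payload.length;
  };
}

var reader = createReader();
var lengthSeen = reader(); // payload is still reachable through `reader`
console.log(lengthSeen);   //=> 1000

// Dropping the only reference to the closure makes both the closure and
// the captured payload unreachable, so the collector is free to reclaim them.
reader = null;
```

When exactly the memory is reclaimed is up to the engine; the code only controls whether the object is still reachable.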

2. JavaScript's memory reclamation mechanism

Here I will use V8, the engine developed by Google and used by Chrome and Node.js, as an example to briefly introduce JavaScript's memory reclamation mechanism. For more detail, you can buy my good friend Pu Ling's book "Node.js in a Simple and Easy Way"; its "Memory Control" chapter covers the subject quite thoroughly.

In V8, memory for all JavaScript objects is allocated on the heap.


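When we declare a variable in code and assign it a value, V8 allocates part of the heap for it. If the memory already requested is not enough to store the variable, V8 requests more, until the heap reaches V8's memory limit. By default, the upper limit of V8's heap is 1464MB on 64-bit systems and 732MB on 32-bit systems, that is, about 1.4GB and 0.7GB.

V8 also manages the JavaScript objects in the heap by generation: the new generation and the old generation. The new generation holds short-lived objects such as temporary variables and strings. The old generation holds long-lived objects that have survived several garbage collections, such as main controllers and server objects.

Garbage collection algorithms have always been an important part of programming language development. V8 mainly uses the following:

1. Scavenge: manages memory space by copying; mainly used for the new generation.
2. Mark-Sweep and Mark-Compact: reclaim and compact heap memory by marking; mainly used to examine and reclaim old-generation objects.

PS: For more detail on V8's garbage collection implementation, consult the relevant books, documentation, and source code.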
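
As a rough way to see the heap limit and the heap spaces on your own machine, the sketch below uses the v8 module built into Node.js. It is an illustration rather than part of the discussion above: toMB is a hypothetical helper, and the numbers reported depend on the Node.js version and on flags such as --max-old-space-size.

```javascript
var v8 = require('v8');

// Hypothetical helper: convert bytes to whole megabytes.
function toMB(bytes) {
  return Math.round(bytes / 1024 / 1024);
}

var stats = v8.getHeapStatistics();
console.log('heap size limit (MB):', toMB(stats.heap_size_limit));
console.log('used heap size (MB):', toMB(stats.used_heap_size));

// Per-space statistics; the list typically includes the spaces used
// for the new and old generations.
var spaces = v8.getHeapSpaceStatistics().map(function(space) {
  return space.space_name;
});
console.log(spaces);
```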

Next, let's look at which objects the JavaScript engine reclaims, and under what circumstances.

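2.1 Scope and references

Beginners often assume that objects declared inside a function are destroyed as soon as the function finishes executing. That understanding is neither rigorous nor complete, and it easily leads to confusion.

References are a very important mechanism in JavaScript programming, yet oddly most developers pay no particular attention to them, or do not understand them at all. A reference is the abstract relationship of "code accessing an object". It resembles a C/C++ pointer but is not the same thing. References are also the most critical mechanism the JavaScript engine relies on during garbage collection.

Take the following code as an example: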

// ......
var val = 'hello world';
function foo() {
  return function() {
    return val;
  };
}
global.bar = foo();
// ......

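After reading this code, can you say which objects are still alive once it has run?

By the relevant rules, the objects that are not reclaimed are val and bar(). What exactly prevents them from being reclaimed?

How does a JavaScript engine perform garbage collection? The algorithms mentioned earlier only describe how reclamation is carried out. How does the engine know which objects can be reclaimed and which must stay alive? The answer is references to JavaScript objects.

In JavaScript code, even writing a variable name on a line by itself, without doing anything else, is treated by the engine as an access to the object, and therefore as a reference to it. To make sure garbage collection does not interfere with program logic, the engine must never reclaim an object that is in use; otherwise chaos would follow. So the criterion for whether an object is in use is whether any reference to it still exists. This is really a compromise. Because references in JavaScript can be passed around, a reference may be carried into the global scope even though the business logic no longer needs to access the object. The object ought to be reclaimed, yet the engine will rigidly assume the program still needs it.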
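
A minimal sketch of such a "transferred" reference follows. The names registry, load, and report are hypothetical; registry simply stands in for any long-lived object, such as something attached to the global scope.

```javascript
var registry = {};   // stand-in for a long-lived object

function load() {
  var report = { rows: new Array(1000).fill(0) }; // hypothetical data
  registry.lastReport = report;  // the reference is carried outward
  return report.rows.length;
}

var rowCount = load();

// The local variable `report` no longer exists, but the object it pointed
// to is still reachable through registry, so it cannot be reclaimed.
var stillReachable = registry.lastReport !== undefined;
console.log(rowCount, stillReachable); //=> 1000 true

// Once the business logic no longer needs it, remove the reference.
delete registry.lastReport;
```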

Using variables and references the right way is the key to optimizing JavaScript at the language level.

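3. Optimizing your JavaScript

We have finally reached the main topic. Thank you for patiently reading this far. After the introduction above, you should have a good understanding of JavaScript's memory management, and the techniques below will make you even more effective.

3.1 Make good use of functions

If you are in the habit of reading good JavaScript projects, you will notice that many experts wrap their front-end JavaScript code in an anonymous function at the outermost level.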

(function() {
  // main business logic
})();

Some go a step further:

;(function(win, doc, $, undefined) {
  // main business logic
})(window, document, jQuery);

Even front-end module loading solutions such as RequireJS, SeaJS, and OzJS use a similar form:

// RequireJS
define(['jquery'], function($) {
  // main business logic
});

// SeaJS
define('module', ['dep', 'underscore'], function($, _) {
  // main business logic
});

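If you think the code of many open-source Node.js projects is not handled this way, you are mistaken. Before actually running the code, Node.js wraps every .js file into the following form: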

(function(exports, require, module, __filename, __dirname) {
  // main business logic
});

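What is the benefit of doing this? As stated at the beginning of the article, scopes in JavaScript are formed by function calls, with statements, and the global scope. We also know that an object defined in the global scope is likely to stay alive until the process exits, which is a problem if the object is large. For example, some people like to render templates in JavaScript: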

<?php
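  $db = mysqli_connect(server, user, password, 'myapp');
  $result = mysqli_query($db, "SELECT * FROM topics;");
  // Fetch the rows as arrays so that json_encode() below emits the data
  $topics = mysqli_fetch_all($result, MYSQLI_ASSOC);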
?>
<!doctype html>
<html>
<head>
  <meta charset="UTF-8">
  <title>Are you a clown sent by the monkey?</title>
</head>
<body>
  <ul id="topics"></ul>
  <script type="text/tmpl" id="topic-tmpl">
    <li>
      <h1><%=title%></h1>
      <p><%=content%></p>
    </li>
  </script>
  <script type="text/javascript">
    var data = <?php echo json_encode($topics); ?>;
    var topicTmpl = document.querySelector('#topic-tmpl').innerHTML;
    var render = function(tmpl, view) {
      var compiled = tmpl
        .replace(/\n/g, '\\n')
        .replace(/<%=([\s\S]+?)%>/g, function(match, code) {
          return '" + escape(' + code + ') + "';
        });

      compiled = [
        'var res = "";',
        'with (view || {}) {',
          'res = "' + compiled + '";',
        '}',
        'return res;'
      ].join('\n');

      var fn = new Function('view', compiled);
      return fn(view);
    };

    var topics = document.querySelector('#topics');
    function init() {
      data.forEach(function(topic) {
        topics.innerHTML += render(topicTmpl, topic);
      });
    }
    init();
  </script>
</body>
</html>

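This kind of code often shows up in beginners' work. What is the problem with it? If the amount of data fetched from the database is very large, the data variable sits idle once the front end has finished rendering the templates. But because the variable is defined in the global scope, the JavaScript engine will not reclaim it. It stays in the old-generation heap until the page is closed.

With one very simple change, wrapping the logic in a function, the result is quite different. Once the UI has rendered, the code's reference to data is released, and when the outermost function finishes executing, the JavaScript engine can examine the objects inside it and reclaim data.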
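
A minimal sketch of that wrapping, assuming the server data has been reduced to a small inline array and the template to plain string concatenation so that it runs outside a browser. The sample topics and the simplified render function are hypothetical.

```javascript
var rendered = (function() {
  // Hypothetical stand-in for the data echoed by the server.
  var data = [
    { title: 'First', content: 'Hello' },
    { title: 'Second', content: 'World' }
  ];

  // Simplified renderer; the real one would compile a template.
  var render = function(topic) {
    return '<li><h1>' + topic.title + '</h1><p>' + topic.content + '</p></li>';
  };

  // Only the resulting string leaves the function; data and render do not.
  return data.map(render).join('');
})();

console.log(typeof data);                   //=> undefined
console.log(rendered.indexOf('First') > 0); //=> true
```

After the wrapper returns, nothing outside it refers to data, so the array is eligible for reclamation even though the page keeps running.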

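3.2 Never define global variables

As just discussed, when a variable is defined in the global scope, by default the JavaScript engine will not reclaim it. The variable stays in the old-generation heap until the page is closed.

So we stick to one principle: never use global variables. Global variables are certainly convenient during development, but the problems they cause far outweigh the convenience:

1. They make variables hard to reclaim;
2. They easily cause confusion when several people collaborate;
3. They are easily interfered with in the scope chain.

Combined with the wrapper function above, we can also use the wrapper function to handle "global variables".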
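
One possible shape for this, sketched under the assumption that a single exported object is acceptable: the state that would otherwise be global lives inside the wrapper, and only a small interface is exposed. The names app, counter, next, and reset are hypothetical.

```javascript
var app = (function() {
  var counter = 0;   // private to the wrapper, not a global variable

  return {
    next: function() {
      counter += 1;
      return counter;
    },
    reset: function() {
      counter = 0;
    }
  };
})();

var first = app.next();
var second = app.next();
console.log(first, second);  //=> 1 2
console.log(typeof counter); //=> undefined
```

Other code can only reach counter through app, which also avoids the naming collisions and scope chain interference listed above.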

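3.3 Manually release variable references

If a variable is definitely no longer needed in the business code, you can release the reference manually so that the object can be reclaimed.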

var data = { /* some big data */ };
// blah blah blah
data = null;

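3.4 Make good use of callbacks

Besides using closures to access inner variables, we can also use the now very popular callback functions for business processing.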

function getData(callback) {
  var data = 'some big data';

  callback(null, data);
}

getData(function(err, data) {
  console.log(data);
});

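Callbacks are a Continuation Passing Style (CPS) technique. This style of programming shifts the focus of a function from its return value to the callback. It also has several advantages over closures:

1. If the argument passed in is a primitive type (such as a string or a number), the callback's parameter is a copied value, which is easier to reclaim once the business code is done with it;
2. Besides synchronous requests, callbacks can also be used in asynchronous programming, which is a very popular style today;
3. The callback itself is usually a temporary anonymous function. Once the requesting function has finished executing, the reference to the callback is released and the callback itself is reclaimed.

3.5 Good closure management

When business requirements (such as binding events in a loop, private properties, or callbacks with parameters) make closures unavoidable, pay careful attention to the details.

Binding events in a loop is practically a required first lesson in JavaScript closures. Suppose there are six buttons corresponding to six events, and when the user clicks a button, the corresponding event is written to a designated place.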

var btns = document.querySelectorAll('.btn'); // 6 elements
var output = document.querySelector('#output');
var events = [1, 2, 3, 4, 5, 6];

// Case 1: every handler shares the same `i`, which equals btns.length
// by the time any button is clicked
for (var i = 0; i < btns.length; i++) {
  btns[i].onclick = function(evt) {
    output.innerText += 'Clicked ' + events[i];
  };
}

// Case 2: each handler captures its own index
for (var i = 0; i < btns.length; i++) {
  btns[i].onclick = (function(index) {
    return function(evt) {
      output.innerText += 'Clicked ' + events[index];
    };
  })(i);
}

// Case 3: each handler captures the event value itself
for (var i = 0; i < btns.length; i++) {
  btns[i].onclick = (function(event) {
    return function(evt) {
      output.innerText += 'Clicked ' + event;
    };
  })(events[i]);
}

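The first solution is clearly the classic loop event-binding mistake. I will not go into it here; for details, see my answer to another user. The difference between the second and third solutions lies in the argument passed into the closure.

The second solution passes in the current loop index, while the third passes in the corresponding event object directly. The latter is in fact better suited to large amounts of data. When the argument passed in a function call is a primitive value, the parameter received in the function body is a copy, so the value is defined as a local variable in the function's scope. After the events are bound, the events variable can be released manually to reduce memory usage in the outer scope. And when an element is removed, its event listener, event object, and closure are destroyed and reclaimed along with it.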
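
The sketch below shows the third approach followed by the manual release. Plain objects stand in for the DOM buttons and an array stands in for the output element so that it runs outside a browser; the event names are hypothetical.

```javascript
var btns = [{}, {}, {}];                 // stand-ins for DOM buttons
var events = ['open', 'save', 'close'];  // hypothetical event names
var output = [];                         // stand-in for the output element

for (var i = 0; i < btns.length; i++) {
  btns[i].onclick = (function(event) {
    return function() {
      output.push('Clicked ' + event);
    };
  })(events[i]);
}

// Each handler holds its own copy of a primitive value, so the outer
// array is no longer needed and its reference can be released.
events = null;

btns[1].onclick();
console.log(output); //=> [ 'Clicked save' ]
```

With the index-based approach, events could not be set to null here, because every handler would still read from the array when clicked.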

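3.6 Memory is not a cache

Caching plays a pivotal role in business development and can ease the burden on time and space resources. But be careful not to treat memory as a cache lightly. Memory is precious in any kind of program. If a resource is not very important, do not keep it directly in memory, or else set up an expiration mechanism that automatically destroys stale cache entries.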
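
One simple form such an expiration mechanism could take is a time-to-live check on every read. This is a sketch under stated assumptions rather than a recommended library: createCache, ttlMs, and the injectable now clock are hypothetical names, and entries are only evicted lazily, when they are next read.

```javascript
function createCache(ttlMs, now) {
  now = now || Date.now;   // injectable clock, mainly for testing
  var store = new Map();

  return {
    set: function(key, value) {
      store.set(key, { value: value, expires: now() + ttlMs });
    },
    get: function(key) {
      var entry = store.get(key);
      if (!entry) return undefined;
      if (entry.expires <= now()) {
        store.delete(key);   // drop the stale entry so it can be reclaimed
        return undefined;
      }
      return entry.value;
    },
    size: function() {
      return store.size;
    }
  };
}

// Usage with a fake clock so the expiry is deterministic.
var fakeTime = 0;
var cache = createCache(1000, function() { return fakeTime; });
cache.set('user:1', { name: 'Ann' });
var hit = cache.get('user:1');    // still fresh
fakeTime = 1500;
var miss = cache.get('user:1');   // expired and removed
console.log(hit.name, miss, cache.size()); //=> Ann undefined 0
```

A cache that is written far more often than it is read would also need a periodic sweep or a size cap, since lazy eviction never touches keys that are not requested again.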

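4. Checking JavaScript memory usage

In day-to-day development, we can also use tools to analyze memory usage in JavaScript and troubleshoot problems.

4.1 Blink / WebKit browsers

In Blink / WebKit browsers (Chrome, Safari, Opera, etc.), we can use the Profiles tool in Developer Tools to inspect our program's memory.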


4.2 Memory inspection in Node.js

In Node.js, we can use the node-heapdump and node-memwatch modules for memory inspection.

var heapdump = require('heapdump');
var fs = require('fs');
var path = require('path');
// Write the PID as a string so it can be read back when sending the signal
fs.writeFileSync(path.join(__dirname, 'app.pid'), String(process.pid));
// ...

After requiring node-heapdump in the business code, at some point during runtime we send the SIGUSR2 signal to the Node.js process so that node-heapdump captures a snapshot of the heap.

$ kill -USR2 $(cat app.pid)

A snapshot file whose name begins with heapdump- and ends in .heapsnapshot then appears in the directory. We can open it with the Profiles tool in the browser's Developer Tools and inspect it.
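
For a quick check that needs no extra module, the built-in process.memoryUsage() can be sampled before and after a suspect piece of code. This sketch is a complement to heap snapshots, not a replacement; heapUsedMB is a hypothetical helper, and the allocation is only there to give it something to measure.

```javascript
// Hypothetical helper: current V8 heap usage in megabytes.
function heapUsedMB() {
  return process.memoryUsage().heapUsed / 1024 / 1024;
}

var before = heapUsedMB();
var big = new Array(1e6).fill('x');   // hypothetical allocation to observe
var after = heapUsedMB();

console.log('heap used before: ' + before.toFixed(1) + ' MB');
console.log('heap used after:  ' + after.toFixed(1) + ' MB');
```

The two figures are only rough indicators, since a garbage collection can run between the samples.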

5. Summary

That brings us to the end of the article. This post covered the following points:

1. How JavaScript relates to memory usage at the language level;
2. Memory management and reclamation in JavaScript;
3. How to use memory more efficiently so that the JavaScript we write has more room to grow;
4. How to inspect memory when memory problems arise.
