Home >Web Front-end >JS Tutorial >Let's talk about V8's memory management and garbage collection algorithm

Let's talk about V8's memory management and garbage collection algorithm

青灯夜游
青灯夜游forward
2022-04-27 20:44:483057browse

This article will take you to understand the memory management and garbage collection algorithm of the V8 engine. I hope it will be helpful to you!

Let's talk about V8's memory management and garbage collection algorithm

As we all know, JS automatically manages garbage collection, and developers do not need to care about memory allocation and recycling. Moreover, the garbage collection mechanism is also a commonly tested part in front-end interviews. This article mainly explains the generational garbage collection algorithm of V8. I hope that after reading this article, friends can have a painful about the V8garbage collection mechanism (haha, it is painful!!!), the article mainly covers the following content:

  • V8Memory limitations and solutions
  • New generation memory objectsScavengeAlgorithm
  • Based on reachability analysis algorithmLogic and optimization methods for marking surviving objects
  • Promotion conditions for new generation memory objects,
  • ScavengeThe depth/breadth first difference of the algorithm
  • Cross-generation memory write barrier
  • Mark clearing/organizing algorithm for old generation memory objects
  • GCSTWCauses and optimization strategies

V8 memory limitations and solutions

V8 initially Designed for browsers, there are fewer scenarios where large memory is used. There are restrictions on memory usage by default in the design, and only part of the memory is allowed to be used. The 64-bit system allows about 1.4g of memory, and the 32-bit system allows about 0.7g of memory. As shown in the following code, check the memory limit method of the dependent V8 engine in Node:

process.memoryUsage();

// 返回内存的使用量,单位字节
{
  rss: 22953984,
  // 申请的总的堆内存
  heapTotal: 9682944,
  // 已使用的堆内存
  heapUsed: 5290344,
  external: 9388
}

Lets talk about V8s memory management and garbage collection algorithm

V8There is another way to limit the memory usage size The important reason is that when the heap memory is too large, V8 takes a long time to perform garbage collection (1.5g takes 50ms), and it takes longer to do non-incremental garbage collection. Long time (1.5g to 1s). After explaining the garbage collection mechanism of V8 later, I believe everyone will be able to relate to it more.

Although the V8 engine has limits on memory usage, the method to modify the memory limit is also exposed, which is to add relevant parameters when starting the V8 engine. The following code is demonstrated in Modify the dependent V8 engine memory limit in Node:

# 更改老生代的内存限制,单位mb
node --max-old-space-size=2048 index.js

# 更改新生代的内存限制,单位mb
node --max-semi-space-size=1024=64 index.js

It should be noted here that the syntax of the changed new generation memory has been changed to the above writing method, and The unit has also changed from kb to mb. The old way of writing is node --max-new-space-size. You can query the current ## through the following command. #NodeEnvironment to modify the syntax of the new generation memory:

node --v8-options | grep max

Lets talk about V8s memory management and garbage collection algorithm

V8 garbage collection strategy

Automatic garbage collection in the engine In the historical evolution of the recycling mechanism, people found that there is no universal algorithm that can solve garbage collection in any scenario. Therefore, modern garbage collection algorithms

divide memory garbage into generations based on the survival time of objects. Generational garbage collection algorithms implement different methods for different types of memory garbage. recycling algorithm.

V8

Divide the memory into two types: New generation and Old generation:

In the new generation memory The survival time of objects in the old generation memory is shorter
  • The generation objects in the old generation memory have a longer survival time or they are resident in memory
  • The new generation memory is stored in the new generation memory space (
semispace

), the old generation memory is stored in the old generation memory space (oldspace), as shown in the following figure:

Lets talk about V8s memory management and garbage collection algorithm

New student Generation memory uses
    Scavenge
  • algorithm Old generation memory uses
  • Mark-Sweep
  • and Mark-Compact algorithms
  • Let’s take a look at the algorithm logic of
Scavenge

!

Scavenge Algorithm

For memory recycling of new generation memory, the

Scavenge

algorithm is used. The specific implementation of Scavenge is CheneyAlgorithm. CheneyThe algorithm is to divide the new generation memory space into two, one space is in use (FromSpace), and the other space is in idle state (called ToSpace) .

在内存开始分配时,首先在FromSpace中进行分配,垃圾回收机制执行时会检查FromSpace中的存活对象,存活对象会被会被复制到ToSpace,非存活对象所占用的空间将被释放,复制完成后FromSpaceToSpace的角色将翻转。当一个对象多次复制后依然处于存活状态,则认为其是长期存活对象,此时将发生晋升,然后该对象被移动到老生代空间oldSpace中,采用新的算法进行管理。

Lets talk about V8s memory management and garbage collection algorithm

Scavenge算法其实就是在两个空间内来回复制存活对象,是典型的空间换时间做法,所以非常适合新生代内存,因为仅复制存活的对象且新生代内存中存活对象是占少数的。但是有如下几个重要问题需要考虑:

  • 引用避免重复拷贝

假设存在三个对象temp1、temp2、temp3,其中temp2、temp3都引用了temp1,js代码示例如下:

var temp2 = {
  ref: temp1,
}

var temp3 = {
  ref: temp1,
}

var temp1 = {}

FromSpace中拷贝temp2ToSpace中时,发现引用了temp1,便把temp1也拷贝到ToSpace,是一个递归的过程。但是在拷贝temp3时发现也引用了temp1,此时再把temp1拷贝过去则重复了。

要避免重复拷贝,做法是拷贝时给对象添加一个标记visited表示该节点已被访问过,后续通过visited属性判断是否拷贝对象。

  • 拷贝后保持正确的引用关系

还是上述引用关系,由于temp1不需要重复拷贝,temp3被拷贝到ToSpace之后不知道temp1对象在ToSpace中的内存地址。

做法是temp1被拷贝过去后该对象节点上会生成新的field属性指向新的内存空间地址,同时更新到旧内存对象的forwarding属性上,因此temp3就可以通过旧temp1forwarding属性找到在ToSpace中的引用地址了。

内存对象同时存在于新生代和老生代之后,也带来了问题:

  • 内存对象跨代(跨空间)后如何标记
const temp1 = {}

const temp2 = {
  ref: temp1,
}

比如上述代码中的两个对象temp1temp2都存在于新生代,其中temp2引用了temp1。假设在经过GC之后temp2晋升到了老生代,那么在下次GC的标记阶段,如何判断temp1是否是存活对象呢?

在基于可达性分析算法中要知道temp1是否存活,就必须要知道是否有根对象引用引用了temp1对象。如此的话,年轻代的GC就要遍历所有的老生代对象判断是否有根引用对象引用了temp1对象,如此的话分代算法就没有意义了。

解决版本就是维护一个记录所有的跨代引用的记录集,它是写缓冲区的一个列表。只要有老生代中的内存对象指向了新生代内存对象时,就将老生代中该对象的内存引用记录到记录集中。由于这种情况一般发生在对象写的操作,顾称此为写屏障,还一种可能的情况就是发生在晋升时。记录集的维护只要关心对象的写操作和晋升操作即可。此是又带来了另一个问题:

  • 每次写操作时维护记录集的额外开销

优化的手段是在一些Crankshaft操作中是不需要写屏障的,还有就是栈上内存对象的写操作是不需要写屏障的。还有一些,更多的手段就不在这里过多讨论。

  • 缓解Scavenge算法内存利用率不高问题

新生代内存中存活对象占比是相对较小的,因此可以在分配空间时,ToSpace可以分配的小一些。做法是将ToSpace空间分成S0S1两部分,S0用作于ToSpaceS1与原FromSpace合并当成FromSpace

Lets talk about V8s memory management and garbage collection algorithm

Scavenge算法中深度/广度优先的区别

垃圾回收算法中,识别内存对象是否是垃圾的机制一般有两种:引用计数基于可达性分析

Based on reachability analysis, it is to find all root references (such as global variables, etc.), traverse all root references, all references on the recursive root reference, everything that is traversed is Live objects and mark them. At this time, other memory objects in the space are dead objects, thus constructing a directed graph.

Taking into account the limitations of recursion, recursive logic generally adopts non-recursive implementation. Common ones include breadth-first and depth-first algorithms. The difference between the two is:

  • changes the order of memory objects when copying depth-first to ToSpace, making objects with reference relationships closer. The reason is that after copying itself, the object referenced by itself is copied directly, so the related objects are closer in ToSpace
  • Depth priority is just the opposite

Because of the CPU's caching strategy, when reading a memory object, there is a high probability that the objects behind it will be read together, in order to hit the cache faster. Because a very common scenario during code development is obj1.obj2.obj3. At this time, when the CPU reads obj1, if the following obj2, If obj3 is read together, it will be very helpful to hit the cache.

So the depth-first algorithm is more conducive to business logic hitting the cache, but its implementation requires additional stack-assisted implementation of the algorithm, which consumes memory space. Breadth first, on the contrary, cannot improve cache hits, but its implementation can use pointers to cleverly avoid space consumption, and the algorithm execution efficiency is high.

Promotion conditions for new generation memory objects

If memory objects in the new generation want to be promoted to the old generation, they need to meet the following conditions:

  • Whether the object has been experiencedScavengeRecycling
  • ToSpaceThe memory usage ratio cannot exceed the limit

Judge whether it has been experienced# The logic of ##Scavenge's GC is to give the age attribute of the surviving object 1 every time GC, and when GC## is done again Just judge the age attribute when #. The basic promotion diagram is as follows:

Lets talk about V8s memory management and garbage collection algorithmThere are many long-term surviving objects in the old generation memory, and the reason why the

Scavenge

algorithm cannot be recycled is:

More surviving objects lead to low copy efficiency
  • Wasted half of the memory space
Recycling algorithm of old generation memory objects

Garbage collection of the old generation memory space uses

Mark-Sweep

(Mark-Sweep) and Mark-Sweep(Mark -Compact) combination method. Marking and clearing is divided into two parts:

Marking phase
  • Clearing phase (if it is marking and sorting, it is the sorting phase)
  • Traverse the old students in the marking phase All memory objects in the heap memory are generated and live objects are marked. The cleanup phase only cleans up unmarked objects. The reason is: non-survival objects account for a minority in the old generation memory.

Lets talk about V8s memory management and garbage collection algorithmAs shown in the figure above, one problem with mark clearing is that after clearing, there is a discontinuous space that cannot be used anymore, so memory cleaning of the old generation memory space is required. A solution combined with tagging. This solution is to move the living objects to one side during the marking process, and then clean up and remove all non-living objects outside the boundary after the movement is completed.

Lets talk about V8s memory management and garbage collection algorithm

Full pause of garbage collection

It is necessary to pause the application execution logic during garbage collection, and then resume the application after the garbage collection mechanism is completed Execution logic, this behavior is called "

Full Pause

", which is often called Stop The World, referred to as STW. Garbage collection of young generation memory has little impact on application execution, but because there are many surviving objects in old generation memory, the impact of full pauses caused by garbage collection of old generation memory is very large.

Lets talk about V8s memory management and garbage collection algorithmIn order to optimize the full pause time of GC, V8 also introduces

incremental markers

, concurrency markers,parallel Mark , Incremental cleaning, Parallel cleaning, Delayed cleaning and other methods.

STW Optimization

An important metric to measure the time spent in garbage collection is the amount of time the main thread is paused while executing

GC

. The impact of STW is unacceptable, so V8 has also adopted many optimization methods. <ul><li>Parallel GC</li></ul> <p>The GC process needs to do a lot of things, which will lead to the STW phenomenon on the main thread. The method of parallel GC is to open multiple auxiliary threads to share the GC work. This approach still cannot avoid the STW phenomenon, but it can reduce the total time of STW, depending on the number of auxiliary threads enabled. </p> <p><img src="https://img.php.cn/upload/image/795/713/176/165106317370481Lets%20talk%20about%20V8s%20memory%20management%20and%20garbage%20collection%20algorithm" title="165106317370481Lets talk about V8s memory management and garbage collection algorithm" alt="1Lets talk about V8s memory management and garbage collection algorithm"></p> <ul><li>Incremental<code>GC

Incremental GC splits the GC work and executes it on the main thread Intermittent step-by-step execution. This approach will not reduce the GC time. On the contrary, it will cost a little, but it will also reduce the total STW time of the GC.

1Lets talk about V8s memory management and garbage collection algorithm

  • ConcurrencyGC

Concurrent GC means that GC runs in the background and no longer runs on the main thread . This approach will avoid the STW phenomenon.

1Lets talk about V8s memory management and garbage collection algorithm

  • Idle timeGC

The rendering of animations in Chrome is approx. ##60 frames (each frame is about 16ms), if the current rendering time reaches 16.6ms, then there will be free time to do other things, such as partGCTask.

Lets talk about V8s memory management and garbage collection algorithm

Reduce the impact of garbage collection

If you want to improve execution efficiency, you must minimize the execution and consumption of garbage collection:

  • Be careful when using memory as a cache and be careful about using objects as a cache. To reasonably limit the expiration time and infinite growth issues, you can use the lru strategy

  • Avoid using memory to store user sessions in

    Node. Otherwise, storing a large number of user session objects in the memory will cause the memory of the old generation to surge, affecting cleanup performance and thus affecting application execution performance and memory overflow. Improved ways to use redis, etc. Benefits of moving the cache externally:

      Reduce the number of resident memory objects and make garbage collection more efficient
    • The cache can be shared between processes
For more node-related knowledge, please visit:

nodejs tutorial!

The above is the detailed content of Let's talk about V8's memory management and garbage collection algorithm. For more information, please follow other related articles on the PHP Chinese website!

Statement:
This article is reproduced at:juejin.cn. If there is any infringement, please contact admin@php.cn delete