Home > Article > Web Front-end > Issues about v8 sorting source code in JavaScript
The twentieth and final article in the JavaScript special series, interpreting the v8 sorting source code
v8 is Chrome's JavaScript engine, in which there are some details about arrays Sorting is entirely implemented in JavaScript.
The algorithm used for sorting is related to the length of the array. When the array length is less than or equal to 10, insertion sort is used. When the array length is greater than 10, quick sort is used. (Of course, this statement is not rigorous).
Let’s take a look at insertion sort and quick sort first.
Treat the first element as an ordered sequence, traverse the array, and insert subsequent elements into the constructed ordered sequence in sequence.
function insertionSort(arr) { for (var i = 1; i < arr.length; i++) { var element = arr[i]; for (var j = i - 1; j >= 0; j--) { var tmp = arr[j]; var order = tmp - element; if (order > 0) { arr[j + 1] = tmp; } else { break; } } arr[j + 1] = element; } return arr; } var arr = [6, 5, 4, 3, 2, 1]; console.log(insertionSort(arr));
The time complexity is It refers to the computational workload required to execute the algorithm. It examines the situation when the size of the input value approaches infinity. Generally, the number of times the basic operations in the algorithm are repeated is a function of the problem size n.
The best case: the array is arranged in ascending order, the time complexity is: O(n)
The worst case: the array is arranged in descending order, the time complexity is: O(n²)
Stability refers to whether the same elements maintain their relative positions after sorting.
It should be noted that for unstable sorting algorithms, just give an example to illustrate its instability; while for stable sorting algorithms, the algorithm must be analyzed to obtain stable characteristics.
For example, [3, 3, 1], after sorting, is still [3, 3, 1], but in fact the second 3 is before the first 3, then this is an unstable sorting algorithm.
Insertion sort is a stable algorithm.
When the array is about to be sorted or the problem size is relatively small, insertion sorting is more efficient. This is why v8 uses insertion sort when the array length is less than or equal to 10.
Select an element as "baseline"
is less than "baseline" Elements that are larger than "base" are moved to the left of "base"; elements that are larger than "base" are moved to the right of "base".
Repeat the first and second steps for the two subsets on the left and right of the "baseline" until only one element remains in all subsets.
The example and the following implementation method come from Teacher Ruan Yifeng's "Javascript Implementation of Quick Sort"
With array[ 85, 24, 63, 45, 17, 31, 96, 50] For example:
In the first step, select the middle element 45 as the "baseline". (The base value can be chosen arbitrarily, but it is easier to understand to choose the middle value.)
The second step, in order, combine each element with The "baseline" is compared, forming two subsets, one "less than 45" and the other "greater than or equal to 45".
The third step is to repeat the first and second steps for the two subsets until only one element remains in all subsets.
var quickSort = function(arr) { if (arr.length <= 1) { return arr; } // 取数组的中间元素作为基准 var pivotIndex = Math.floor(arr.length / 2); var pivot = arr.splice(pivotIndex, 1)[0]; var left = []; var right = []; for (var i = 0; i < arr.length; i++){ if (arr[i] < pivot) { left.push(arr[i]); } else { right.push(arr[i]); } } return quickSort(left).concat([pivot], quickSort(right)); };
However, this implementation requires additional space to store the left and right subsets, so there is another original Implementation of in-place sorting.
Let’s take a look at the implementation diagram of in-place sorting:
In order to let everyone After understanding the principle of quick sort, I slowed down the execution speed.
In this diagram, the value rule for the benchmark is to take the leftmost element. Yellow represents the current benchmark, green represents elements that are smaller than the benchmark, and purple represents elements that are greater than the benchmark.
We will find that the green element will be immediately to the right of the benchmark, the purple element will be moved to the back, and then the benchmark and the last green element will be exchanged. At this time, the benchmark is in the correct position, that is The previous elements are all less than the base value, and the following elements are all greater than the base value. Then take the basis of the previous and following multiple elements and sort them.
function quickSort(arr) { // 交换元素 function swap(arr, a, b) { var temp = arr[a]; arr[a] = arr[b]; arr[b] = temp; } function partition(arr, left, right) { var pivot = arr[left]; var storeIndex = left; for (var i = left + 1; i <= right; i++) { if (arr[i] < pivot) { swap(arr, ++storeIndex, i); } } swap(arr, left, storeIndex); return storeIndex; } function sort(arr, left, right) { if (left < right) { var storeIndex = partition(arr, left, right); sort(arr, left, storeIndex - 1); sort(arr, storeIndex + 1, right); } } sort(arr, 0, arr.length - 1); return arr; } console.log(quickSort(6, 7, 3, 4, 1, 5, 9, 2, 8))
Quick sort is an unstable sort. If you want to prove that a sorting is unstable, you only need to give an example.
So let’s give one~
就以数组 [1, 2, 3, 3, 4, 5] 为例,因为基准的选择不确定,假如选定了第三个元素(也就是第一个 3) 为基准,所有小于 3 的元素在前面,大于等于 3 的在后面,排序的结果没有问题。可是如果选择了第四个元素(也就是第二个 3 ),小于 3 的在基准前面,大于等于 3 的在基准后面,第一个 3 就会被移动到 第二个 3 后面,所以快速排序是不稳定的排序。
阮一峰老师的实现中,基准取的是中间元素,而原地排序中基准取最左边的元素。快速排序的关键点就在于基准的选择,选取不同的基准时,会有不同性能表现。
快速排序的时间复杂度最好为 O(nlogn),可是为什么是 nlogn 呢?来一个并不严谨的证明:
在最佳情况下,每一次都平分整个数组。假设数组有 n 个元素,其递归的深度就为 log2n + 1,时间复杂度为 O(n)[(log2n + 1)],因为时间复杂度考察当输入值大小趋近无穷时的情况,所以会忽略低阶项,时间复杂度为:o(nlog2n)。
如果一个程序的运行时间是对数级的,则随着 n 的增大程序会渐渐慢下来。如果底数是 10,lg1000 等于 3,如果 n 为 1000000,lgn 等于 6,仅为之前的两倍。如果底数为 2,log21000 的值约为 10,log21000000 的值约为 19,约为之前的两倍。我们可以发现任意底数的一个对数函数其实都相差一个常数倍而已。所以我们认为 O(logn)已经可以表达所有底数的对数了,所以时间复杂度最后为: O(nlogn)。
而在最差情况下,如果对一个已经排序好的数组,每次选择基准元素时总是选择第一个元素或者最后一个元素,那么每次都会有一个子集是空的,递归的层数将达到 n,最后导致算法的时间复杂度退化为 O(n²)。
这也充分说明了一个基准的选择是多么的重要,而 v8 为了提高性能,就对基准的选择做了很多优化。
v8 选择基准的原理是从头和尾之外再选择一个元素,然后三个值排序取中间值。
当数组长度大于 10 但是小于 1000 的时候,取中间位置的元素,实现代码为:
// 基准的下标 // >> 1 相当于除以 2 (忽略余数) third_index = from + ((to - from) >> 1);
当数组长度大于 1000 的时候,每隔 200 ~ 215 个元素取一个值,然后将这些值进行排序,取中间值的下标,实现的代码为:
// 简单处理过 function GetThirdIndex(a, from, to) { var t_array = new Array(); // & 位运算符 var increment = 200 + ((to - from) & 15); var j = 0; from += 1; to -= 1; for (var i = from; i < to; i += increment) { t_array[j] = [i, a[i]]; j++; } // 对随机挑选的这些值进行排序 t_array.sort(function(a, b) { return comparefn(a[1], b[1]); }); // 取中间值的下标 var third_index = t_array[t_array.length >> 1][0]; return third_index; }
也许你会好奇 200 + ((to - from) & 15)
是什么意思?
&
表示是按位与,对整数操作数逐位执行布尔与操作。只有两个操作数中相对应的位都是 1,结果中的这一位才是 1。
以 15 & 127
为例:
15 二进制为: (0000 1111)
127 二进制为:(1111 1111)
按位与结果为:(0000 1111)= 15
所以 15 & 127
的结果为 15
。
注意 15 的二进制为: 1111
,这就意味着任何和 15 按位与的结果都会小于或者等于 15,这才实现了每隔 200 ~ 215 个元素取一个值。
终于到了看源码的时刻!源码地址为:https://github.com/v8/v8/blob/master/src/js/array.js#L758。
function InsertionSort(a, from, to) { for (var i = from + 1; i < to; i++) { var element = a[i]; for (var j = i - 1; j >= from; j--) { var tmp = a[j]; var order = comparefn(tmp, element); if (order > 0) { a[j + 1] = tmp; } else { break; } } a[j + 1] = element; } }; function QuickSort(a, from, to) { var third_index = 0; while (true) { // Insertion sort is faster for short arrays. if (to - from <= 10) { InsertionSort(a, from, to); return; } if (to - from > 1000) { third_index = GetThirdIndex(a, from, to); } else { third_index = from + ((to - from) >> 1); } // Find a pivot as the median of first, last and middle element. var v0 = a[from]; var v1 = a[to - 1]; var v2 = a[third_index]; var c01 = comparefn(v0, v1); if (c01 > 0) { // v1 < v0, so swap them. var tmp = v0; v0 = v1; v1 = tmp; } // v0 <= v1. var c02 = comparefn(v0, v2); if (c02 >= 0) { // v2 <= v0 <= v1. var tmp = v0; v0 = v2; v2 = v1; v1 = tmp; } else { // v0 <= v1 && v0 < v2 var c12 = comparefn(v1, v2); if (c12 > 0) { // v0 <= v2 < v1 var tmp = v1; v1 = v2; v2 = tmp; } } // v0 <= v1 <= v2 a[from] = v0; a[to - 1] = v2; var pivot = v1; var low_end = from + 1; // Upper bound of elements lower than pivot. var high_start = to - 1; // Lower bound of elements greater than pivot. a[third_index] = a[low_end]; a[low_end] = pivot; // From low_end to i are elements equal to pivot. // From i to high_start are elements that haven&#39;t been compared yet. partition: for (var i = low_end + 1; i < high_start; i++) { var element = a[i]; var order = comparefn(element, pivot); if (order < 0) { a[i] = a[low_end]; a[low_end] = element; low_end++; } else if (order > 0) { do { high_start--; if (high_start == i) break partition; var top_elem = a[high_start]; order = comparefn(top_elem, pivot); } while (order > 0); a[i] = a[high_start]; a[high_start] = element; if (order < 0) { element = a[i]; a[i] = a[low_end]; a[low_end] = element; low_end++; } } } if (to - high_start < low_end - from) { QuickSort(a, high_start, to); to = low_end; } else { QuickSort(a, from, low_end); from = high_start; } } } var arr = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]; function comparefn(a, b) { return a - b } QuickSort(arr, 0, arr.length) console.log(arr)
我们以数组 [10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
为例,分析执行的过程。
1.执行 QuickSort 函数 参数 from 值为 0,参数 to 的值 11。
2.10 c7ba78f8f51b7321cde06a9f72ed98cf> 1) = 5,基准值 a[5] 为 5。
3.比较 a[0] a[10] a[5] 的值,然后根据比较结果修改数组,数组此时为 [0, 9, 8, 7, 6, 5, 4, 3, 2, 1, 10]
4.将基准值和数组的第(from + 1)个即数组的第二个元素互换,此时数组为 [0, 5, 8, 7, 6, 9, 4, 3, 2, 1, 10],此时在基准值 5 前面的元素肯定是小于 5 的,因为第三步已经做了一次比较。后面的元素是未排序的。
我们接下来要做的就是把后面的元素中小于 5 的全部移到 5 的前面。
5.然后我们进入 partition 循环,我们依然以这个数组为例,单独抽出来写个 demo 讲一讲
// 假设代码执行到这里,为了方便演示,我们直接设置 low_end 等变量的值 // 可以直接复制到浏览器中查看数组变换效果 var a = [0, 5, 8, 7, 6, 9, 4, 3, 2, 1, 10] var low_end = 1; var high_start = 10; var pivot = 5; console.log(&#39;起始数组为&#39;, a) partition: for (var i = low_end + 1; i < high_start; i++) { var element = a[i]; console.log(&#39;循环当前的元素为:&#39;, a[i]) var order = element - pivot; if (order < 0) { a[i] = a[low_end]; a[low_end] = element; low_end++; console.log(a) } else if (order > 0) { do { high_start--; if (high_start == i) break partition; var top_elem = a[high_start]; order = top_elem - pivot; } while (order > 0); a[i] = a[high_start]; a[high_start] = element; console.log(a) if (order < 0) { element = a[i]; a[i] = a[low_end]; a[low_end] = element; low_end++; } console.log(a) } } console.log(&#39;最后的结果为&#39;, a) console.log(low_end) console.log(high_start)
6.此时数组为 [0, 5, 8, 7, 6, 9, 4, 3, 2, 1, 10]
,循环从第三个元素开始,a[i] 的值为 8,因为大于基准值 5,即 order > 0,开始执行 do while 循环,do while 循环的目的在于倒序查找元素,找到第一个小于基准值的元素,然后让这个元素跟 a[i] 的位置交换。
第一个小于基准值的元素为 1,然后 1 与 8 交换,数组变成 [0, 5, 1, 7, 6, 9, 4, 3, 2, 8, 10]
。high_start 的值是为了记录倒序查找到哪里了。
7. At this time, the value of a[i] becomes 1, and then exchange 1 with the base value 5, and the array becomes [0, 1, 5, 7, 6, 9, 4, 3 , 2, 8, 10]
, the value of low_end is increased by 1, the value of low_end is to record the location of the reference value.
8. The loop is then executed, traversing the fourth element 7, which is consistent with steps 6 and 7. The array first becomes [0, 1, 5, 2, 6, 9, 4, 3, 7, 8, 10]
, and then becomes [0, 1, 2, 5, 6, 9, 4, 3, 7, 8, 10]
9. Traverse the fifth element 6, the same as steps 6 and 7. The array first becomes [0, 1, 2, 5, 3, 9, 4, 6, 7, 8, 10]
, then becomes [0, 1, 2, 3, 5, 9, 4, 6, 7, 8, 10]
10. Traverse the sixth element 9, followed by Steps 6 and 7 are the same. The array first becomes [0, 1, 2, 3, 5, 4, 9, 6, 7, 8, 10]
, and then becomes [0 , 1, 2, 3, 4, 5, 9, 6, 7, 8, 10]
#11. In the next traversal, because i == high_start, it means forward and reverse order The search finally found them together. The following elements must be greater than the base value. At this time, the loop exits
12. The result after traversal is [0, 1, 2, 3, 4, 5, 9, 6, 7, 8, 10]
, the elements before the base value 5 are all less than 5, and the elements behind are all greater than 5, then we perform QuickSort
13 on the two subsets respectively. This When the low_end value is 5, the high_start value is 6, the value of to is still 10, and the value of from is still 0, the result of to - high_start < low_end - from
is true
, We sort the following elements using QuickSort(a, 6, 10), but note that in the new QuickSort, because the value of from - to is less than 10, insertion sort is actually used this time. So to be precise, When the array length is greater than 10, v8 uses a hybrid sorting method of quick sort and insertion sort.
14. Then to = low_end
that is, set to to 5. Because of while(true), it will be executed again. The value of to - from is 5, and InsertionSort is executed. (a, 0, 5), that is, perform an insertion sort on the element before the reference value.
15. Because there is a return statement in the judgment of to - from <= 10, the while loop ends.
16.v8 After performing a quick sort on the array, then performing insertion sort on the two subsets respectively, and finally modifying the array into a correctly sorted array.
Finally, let’s take a schematic diagram to experience insertion sort and quick sort:
The picture comes from https ://www.toptal.com/developers/sorting-algorithms
JavaScript Special Series Directory address: https://github.com/mqyqingfeng/Blog.
The JavaScript topic series is expected to be about twenty articles, mainly studying the implementation of some functional points in daily development, such as anti-shake, throttling, deduplication, type judgment, copy, maximum value, flattening, curry, Recursion, reordering, sorting, etc. are characterized by studying (chao) the implementation of underscore and jQuery.
The above is the detailed content of Issues about v8 sorting source code in JavaScript. For more information, please follow other related articles on the PHP Chinese website!