Home >Web Front-end >JS Tutorial >Performance test comparison of js array deduplication methods

Performance test comparison of js array deduplication methods

不言
不言Original
2018-09-20 16:36:262410browse

The content of this article is about the performance test comparison of js array deduplication methods. It has certain reference value. Friends in need can refer to it. I hope it will be helpful to you.

1. Test template

Array deduplication is a commonplace problem. There are rumors circulating on the Internet. Various solutions

In order to test the performance of these solutions, I wrote a test template to calculate the time consumption of array deduplication

// distinct.js

let arr1 = Array.from(new Array(100000), (x, index)=>{
    return index
})

let arr2 = Array.from(new Array(50000), (x, index)=>{
    return index+index
})

let start = new Date().getTime()
console.log('开始数组去重')

function distinct(a, b) {
    // 数组去重
}

console.log('去重后的长度', distinct(arr1, arr2).length)

let end = new Date().getTime()
console.log('耗时', end - start)

Here, two arrays with lengths of 10W and 5W are created respectively

Then merge the two through the distinct() method Array, and remove duplicates

The amount of data is neither large nor small, but it can already explain some problems

2. Array.filter() indexOf

The idea of ​​this method is to splice two arrays into one array, and then use Array.filter() in ES6 to traverse the array , and combined with indexOf to exclude duplicates

function distinct(a, b) {
    let arr = a.concat(b);    
    return arr.filter((item, index)=> {        
    return arr.indexOf(item) === index
    })
}

This is the array deduplication method that I was criticized for. It looks very simple. But actual performance. . .

Yes, the reality is so cruel, it takes 8427ms to process an array with a length of 15W

3. Double for loop

The easiest way to understand, the outer loop traverses the elements, and the inner loop checks whether it is repeated

When there are duplicate values, you can use push() or splice()

function distinct(a, b) {
    let arr = a.concat(b);
    for (let i=0, len=arr.length; i<len; i++) {
        for (let j=i+1; j<len; j++) {
            if (arr[i] == arr[j]) {
                arr.splice(j, 1);
                // splice 会改变数组长度,所以要将数组长度 len 和下标 j 减一
                len--;
                j--;
            }
        }
    }
    return arr
}

But this method takes up a lot of memory and is the lowest efficient

4. for...of includes()

##Upgraded version of double for loop, replace the outer for loop with for...of statement, and change the inner loop to includes ()

First create an empty array, and when includes() returns false, push the element into the empty array

Similarly, you can also use indexOf() instead of includes()

function distinct(a, b) {
    let arr = a.concat(b)
    let result = []
    for (let i of arr) {
        !result.includes(i) && result.push(i)
    }
    return result
}

##This method is quite similar to filter indexOf

Just implement the internal logic of filter() using a for loop, and then replace indexOf with includes

So the duration is relatively close

5. Array.sort()

First Use sort() to sort the array

Then compare adjacent elements for equality to eliminate duplicates

function distinct(a, b) {
    let arr = a.concat(b)
    arr = arr.sort()
    let result = [arr[0]]

    for (let i=1, len=arr.length; i<len; i++) {
        arr[i] !== arr[i-1] && result.push(arr[i])
    }
    return result
}

This method only performs one sorting and one loop, so the efficiency will be higher than the above method

6. new Set()

ES6 adds the Set data structure, which is similar to an array, but

The members of Set are unique

Based on this feature, it is very suitable for array deduplication

function distinct(a, b) {
    return Array.from(new Set([...a, ...b]))
}

How long does it take to process 15W of data using Set?

Meow meow meow? ? ? 57ms? ? Am I not dazzled? ?

然后我在两个数组长度后面分别加了一个0,在 150W 的数据量之下...

居然有如此高性能且简洁的数组去重办法?!

七、for...of + Object

这个方法我只在一些文章里见过,实际工作中倒没怎么用

首先创建一个空对象,然后用 for 循环遍历

利用对象的属性不会重复这一特性,校验数组元素是否重复

function distinct(a, b) {
    let arr = a.concat(b)
    let result = []
    let obj = {}

    for (let i of arr) {
        if (!obj[i]) {
            result.push(i)
            obj[i] = 1
        }
    }

    return result
}

当我看到这个方法的处理时长,我又傻眼了

15W 的数据居然只要 16ms ??? 比 Set() 还快???

然后我又试了试 150W 的数据量...

The above is the detailed content of Performance test comparison of js array deduplication methods. For more information, please follow other related articles on the PHP Chinese website!

Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn