Multiple filtering of large amounts of data in JavaScript
All code uses ES2015 syntax. If you need ES5 syntax, you can transpile it with Babel's "Try it out" page or the TypeScript Playground.
Question raised
A friend asked me a question today. The front-end obtains a large amount of data from the back-end through Ajax. It needs to be filtered according to some conditions. The filtering method is like this:
class Filter {
    filterA(s) {
        let data = this.filterData || this.data;
        this.filterData = data.filter(m => m.a === s);
    }

    filterB(s) {
        let data = this.filterData || this.data;
        this.filterData = data.filter(m => m.b === s);
    }
}
Now my friend is confused: this way of processing the data feels wrong, but they do not know how to fix it.
Found the problem
The problem lies in how the filtering is done. Multiple filtering does work (call filterA() first and then filterB()), but the filtering is irreversible: each call overwrites this.filterData with a smaller set. Suppose the filtering goes like this:
f.filterA("a1");
f.filterB("b1");
f.filterA("a2");
The intent was to filter the data by "a1" and "b1" and then change the first condition to "a2". Instead, the result is an empty set, because the third call only searches records that were already restricted to a === "a1".
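The behavior can be reproduced with a few sample records. This is only a minimal sketch: the constructor and the records are assumptions made for illustration, since the original snippet does not show how this.data is populated.

```javascript
// Assumption: the constructor and the sample records are hypothetical;
// the original class does not show where this.data comes from.
class Filter {
    constructor(data) {
        this.data = data;
    }

    filterA(s) {
        const data = this.filterData || this.data;
        this.filterData = data.filter(m => m.a === s);
    }

    filterB(s) {
        const data = this.filterData || this.data;
        this.filterData = data.filter(m => m.b === s);
    }
}

const records = [
    { a: "a1", b: "b1" },
    { a: "a1", b: "b2" },
    { a: "a2", b: "b1" },
    { a: "a2", b: "b2" }
];

const f = new Filter(records);
f.filterA("a1"); // keeps the two records with a === "a1"
f.filterB("b1"); // keeps only { a: "a1", b: "b1" }
f.filterA("a2"); // searches that single record, which has a === "a1"

console.log(f.filterData.length); // 0, although { a: "a2", b: "b1" } exists in records
```

Note that an empty array is still truthy in JavaScript, so once filterData is empty, every later call keeps filtering the empty array rather than falling back to this.data.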
Solve the problem
Once the problem is identified, the fix follows. Since the problem is caused by the irreversible filtering process, it can be solved by filtering from this.data every time instead of from this.filterData. To do that, the selected filter conditions must be recorded first.
Record filter conditions
It is certainly feasible to record the filter conditions in a list. Note, however, that two filters on the same condition are mutually exclusive and only the last one should be kept, so a hash map (here a plain object keyed by condition name) is a better fit.
class Filter {
    constructor() {
        this.filters = {};
    }

    set(key, filter) {
        this.filters[key] = filter;
    }

    getFilters() {
        return Object.keys(this.filters).map(key => this.filters[key]);
    }
}
With this in place, the earlier sequence of calls becomes:
f.set("A", m => m.a === "a1");
f.set("B", m => m.b === "b1");
f.set("A", m => m.a === "a2");
let filters = f.getFilters(); // length === 2
The filter set in the third statement overwrites the one set in the first. Now apply the resulting filters, in order, to the original data this.data, and you will get the correct result.
Some readers may object that the list returned by getFilters() is not guaranteed to follow the order of the set() calls. That is true: a hash map makes no promise about order. (In practice, JavaScript engines list non-integer string keys in the order they were first added, so a filter that is set again keeps its original position.) For simple conditions the order does not matter, since the result is the same whichever filter runs first. For some compound conditions, however, it can make a difference.
If necessary, you can use an array instead of a map to fix the order, but looking up an existing condition then requires a linear search. If lookup efficiency also matters, you can combine an array with a map. I will not go into detail here.
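One possible reading of the array + map combination is sketched below. The class name OrderedFilters and its fields list and index are hypothetical and not part of the original code. The array fixes the order, and the map gives constant-time lookup of a key's position.

```javascript
// Hypothetical sketch of "array + map": names are illustrative only.
class OrderedFilters {
    constructor() {
        this.list = [];                   // filters, in the order their keys were first set
        this.index = Object.create(null); // key -> position in this.list
    }

    set(key, filter) {
        if (key in this.index) {
            // Same condition set again: replace it in place, keeping its position
            this.list[this.index[key]] = filter;
        } else {
            this.index[key] = this.list.length;
            this.list.push(filter);
        }
    }

    getFilters() {
        return this.list.slice(); // return a copy so callers cannot disturb the order
    }
}

const isA1 = m => m.a === "a1";
const isB1 = m => m.b === "b1";
const isA2 = m => m.a === "a2";

const ordered = new OrderedFilters();
ordered.set("A", isA1);
ordered.set("B", isB1);
ordered.set("A", isA2);

const orderedList = ordered.getFilters();
console.log(orderedList.length);                                // 2
console.log(orderedList[0] === isA2, orderedList[1] === isB1);  // true true
```

This version keeps a replaced filter in the position of its first set() call. If a replaced filter should instead move to the end, the design would need a different bookkeeping scheme.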
Filtering
In practice, it is tedious to call getFilters() and loop over the result every time. Since the data is encapsulated in Filter, consider giving the class a filter() method that serves as the filtering interface.
class Filter {
    filter() {
        let data = this.data;
        for (let f of this.getFilters()) {
            data = data.filter(f);
        }
        return data;
    }
}
However, I think this is not very efficient, especially with a large amount of data. You might as well take advantage of lodash's lazy evaluation (delayed processing).
Using lodash's lazy evaluation
filter() {
    let chain = _(this.data);
    for (let f of this.getFilters()) {
        chain = chain.filter(f);
    }
    return chain.value();
}
lodash enables lazy evaluation when the data contains more than 200 elements. That is, it makes one loop over the data and calls each filter in turn for every element, instead of running a full loop for each filter.
The figure below shows the difference between lazy and non-lazy processing. Non-lazy processing performs n large loops in total (here n = 3) and produces n - 1 intermediate results. Lazy processing performs only one large loop and produces no intermediate results.
But to be honest, I do not like loading an extra library for such a small thing, so I wrote a simple implementation myself.
Implementing lazy processing myself
filter() {
    const filters = this.getFilters();
    return this.data.filter(m => {
        for (let f of filters) {
            // If one filter has already rejected the item,
            // there is no need to check the remaining filters
            if (!f(m)) {
                return false;
            }
        }
        return true;
    });
}
The inner for loop can also be simplified with Array.prototype.every:
filter() {
    const filters = this.getFilters();
    return this.data.filter(m => {
        return filters.every(f => f(m));
    });
}
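Putting the pieces together, the earlier "a1", "b1", then "a2" scenario can be replayed against the recorded-filters design. This is a sketch under stated assumptions: the class name DataFilter, the constructor that takes the data, and the sample records are hypothetical, since the article's snippets do not show how the data is supplied.

```javascript
// Assumptions: class name, constructor signature, and sample records
// are hypothetical and used only for illustration.
class DataFilter {
    constructor(data) {
        this.data = data;
        this.filters = {};
    }

    set(key, filter) {
        this.filters[key] = filter;
    }

    getFilters() {
        return Object.keys(this.filters).map(key => this.filters[key]);
    }

    filter() {
        const filters = this.getFilters();
        // Single pass over the original data: keep a record only if
        // every recorded filter accepts it
        return this.data.filter(m => filters.every(f => f(m)));
    }
}

const sample = [
    { a: "a1", b: "b1" },
    { a: "a1", b: "b2" },
    { a: "a2", b: "b1" },
    { a: "a2", b: "b2" }
];

const df = new DataFilter(sample);
df.set("A", m => m.a === "a1");
df.set("B", m => m.b === "b1");
const firstResult = df.filter();  // [{ a: "a1", b: "b1" }]

df.set("A", m => m.a === "a2");   // change the first condition
const secondResult = df.filter(); // [{ a: "a2", b: "b1" }]

console.log(firstResult, secondResult);
```

Because every call to filter() starts from this.data, changing a condition is reversible. When no filters have been set, every() returns true for each record, so filter() returns all of the data.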
Data filtering is actually not complicated, as long as you think it through and are clear about which data must be retained, which data is temporary (intermediate), and which data is the final result. With the methods on Array.prototype, or tools like lodash, it is easy to handle.