Process a large csv file with parallel processing #eg39

WBOY
2024-09-12 10:16:54

A CSV file stores a large amount of order data.

Use Java to process this file: find orders whose amounts are at least 3,000 and less than 5,000, group them by customer, and compute the total order amount and the number of orders for each customer.

Write the following SPL statement:

=file("d:/OrdersBig.csv").cursor@mtc(;8).select(Amount>=3000 && Amount<5000).groups(Client;sum(Amount):amt,count(1):cnt)

The cursor() function parses a large file that cannot fit into memory; by default, it computes serially. The @m option enables multithreaded data retrieval, and 8 is the number of parallel threads. The @t option imports the first line as column titles, and the @c option uses a comma as the separator.
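To make the idea of splitting the work across 8 threads concrete, here is a minimal plain-Java sketch. It is not how SPL implements @m; it only illustrates the general pattern of aggregating chunks in parallel and then merging the partial results. The class name ChunkedGroupBy, the method names, and the simplified "client,amount" row layout are all assumptions made for illustration, and unlike a cursor this sketch expects the rows to already be in memory.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper: rows are assumed to look like "client,amount".
public class ChunkedGroupBy {

    /** Returns client -> {sum of amounts, order count}, computed on a fixed number of threads. */
    static Map<String, double[]> groupInParallel(List<String> rows, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            int chunk = (rows.size() + threads - 1) / threads; // ceiling division
            List<Future<Map<String, double[]>>> parts = new ArrayList<>();
            for (int start = 0; start < rows.size(); start += chunk) {
                List<String> slice = rows.subList(start, Math.min(start + chunk, rows.size()));
                parts.add(pool.submit(() -> aggregate(slice)));
            }
            // Merge the per-chunk results on the calling thread.
            Map<String, double[]> total = new TreeMap<>();
            for (Future<Map<String, double[]>> part : parts) {
                part.get().forEach((client, v) -> total.merge(client, v, (a, b) -> {
                    a[0] += b[0];
                    a[1] += b[1];
                    return a;
                }));
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("parallel aggregation failed", e);
        } finally {
            pool.shutdown();
        }
    }

    /** Filters one chunk to 3000 <= amount < 5000 and accumulates sum and count per client. */
    private static Map<String, double[]> aggregate(List<String> slice) {
        Map<String, double[]> partial = new HashMap<>();
        for (String row : slice) {
            String[] f = row.split(",", -1);
            double amount = Double.parseDouble(f[1].trim());
            if (amount >= 3000 && amount < 5000) {
                double[] acc = partial.computeIfAbsent(f[0].trim(), k -> new double[2]);
                acc[0] += amount; // running sum
                acc[1] += 1;      // running count
            }
        }
        return partial;
    }
}
```

Calling ChunkedGroupBy.groupInParallel(rows, 8) mirrors the thread count used in the SPL statement. Each worker keeps its own partial map, so no locking is needed until the final merge.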

Read How to Call a SPL Script in Java to learn how to integrate SPL into a Java application.

This problem comes from a question on StackOverflow. The conventional solution discussed there is quite complicated, whereas the SPL approach is simple and efficient.
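For comparison, a conventional solution in plain Java could look something like the sketch below, which uses a parallel stream. It is only one possible approach and is not taken from the StackOverflow thread. The class name OrderSummary and its methods are hypothetical, the column titles Client and Amount are taken from the SPL statement, and the naive split on commas assumes no quoted fields contain commas.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical class; assumes a header row containing "Client" and "Amount".
public class OrderSummary {

    /** Aggregates data lines (header already consumed) into per-client statistics. */
    static Map<String, DoubleSummaryStatistics> summarize(
            Stream<String> dataLines, int clientIdx, int amountIdx) {
        return dataLines
                .parallel()
                .map(line -> line.split(",", -1)) // naive split: no quoted commas
                .filter(f -> {
                    double amount = Double.parseDouble(f[amountIdx].trim());
                    return amount >= 3000 && amount < 5000;
                })
                .collect(Collectors.groupingByConcurrent(
                        f -> f[clientIdx].trim(),
                        Collectors.summarizingDouble(
                                f -> Double.parseDouble(f[amountIdx].trim()))));
    }

    /** Locates the two columns from the header line, then streams the remaining lines. */
    static Map<String, DoubleSummaryStatistics> summarizeFile(Path csv) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(csv)) {
            List<String> header = Arrays.asList(reader.readLine().split(",", -1));
            int clientIdx = header.indexOf("Client");
            int amountIdx = header.indexOf("Amount");
            return summarize(reader.lines(), clientIdx, amountIdx);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java OrderSummary <path-to-csv>");
            return;
        }
        summarizeFile(Paths.get(args[0])).forEach((client, stats) ->
                System.out.println(client + "\t" + stats.getSum() + "\t" + stats.getCount()));
    }
}
```

The stream reads lines lazily, so the whole file does not need to fit in memory, but parallel speed-up from a line-by-line reader is often limited because reading itself stays sequential. Even in this short form, the Java version needs explicit parsing, column lookup, and collector setup that the one-line SPL statement handles through its options.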

SPL open source address
