How much does Java multi-threaded concurrent programming improve data processing efficiency?

    In a work scenario, we ran into the following requirement: update related information in other models based on each host's IP address. The requirement is simple and only involves ordinary linked database queries and updates. During implementation, however, it turned out that because there are so many hosts, looping through the queries and updates takes a long time: a single call to the interface takes about 30-40 min to complete.

    To shorten the execution time of the interface method, we considered multi-threaded concurrent programming: using the parallel execution capability of a multi-core processor to process the data asynchronously, which can greatly shorten the execution time and improve efficiency.

    Here we use FixedThreadPool, a reusable thread pool with a fixed number of threads, together with the CountDownLatch concurrency utility class, which provides the flow control needed to keep the multi-threaded code running correctly:

    • First, call Runtime.getRuntime().availableProcessors() to obtain the number of CPU threads on the running machine. This value is used later to set the number of threads in the fixed thread pool.

    • Second, determine the nature of the task. For a compute-intensive task, set the number of threads to the number of CPU threads + 1; for an IO-intensive task, set it to 2 * the number of CPU threads. Because this method interacts with the database frequently, it is an IO-intensive task.

    • Next, split the data into groups. Each thread processes one group, so the number of groups equals the number of threads. Also create a CountDownLatch counter object, passing the number of threads to its constructor as the initial count, so that the main thread waits for all child threads to finish before it continues.

    • Then call executorService.execute() and implement the run method with the business logic and data-processing code. Remember to decrement the counter by 1 once the current thread has finished its work. Finally, when all child threads have completed, shut down the thread pool.
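
    The sizing rule and the grouping step above can be sketched in isolation. This is a minimal illustration rather than the code used in the project: the class name PoolSizing and the methods threadCount and partition are hypothetical, and partition spreads the remainder over the first groups instead of appending it to the last group, which is a variation on the approach described here.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative helper; all names here are assumptions made for this sketch. */
final class PoolSizing {

    /** Rule of thumb from the text: 2 * N threads for IO-bound work, N + 1 for CPU-bound work. */
    static int threadCount(int logicalCpus, boolean ioBound) {
        return ioBound ? logicalCpus * 2 : logicalCpus + 1;
    }

    /** Splits items into at most `groups` contiguous sublists whose sizes differ by at most one. */
    static <T> List<List<T>> partition(List<T> items, int groups) {
        if (groups <= 0) {
            throw new IllegalArgumentException("groups must be positive");
        }
        List<List<T>> result = new ArrayList<>();
        int base = items.size() / groups;
        int remainder = items.size() % groups;
        int start = 0;
        for (int i = 0; i < groups && start < items.size(); i++) {
            // the first `remainder` groups each take one extra element
            int end = start + base + (i < remainder ? 1 : 0);
            result.add(items.subList(start, end));
            start = end;
        }
        return result;
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        System.out.println("IO-bound threads:  " + threadCount(cpus, true));
        System.out.println("CPU-bound threads: " + threadCount(cpus, false));

        List<Integer> hosts = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            hosts.add(i);
        }
        System.out.println(partition(hosts, 4)); // prints [[0, 1, 2], [3, 4, 5], [6, 7], [8, 9]]
    }
}
```

    With this variant, a list shorter than the thread count yields fewer groups than threads, so the CountDownLatch would need to be initialized with the number of groups actually produced rather than the thread count.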

    With the business logic of the work scenario omitted, the general approach looks like this:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;
    import java.util.concurrent.CountDownLatch;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public ResponseData updateHostDept() {
        // ...
        List<Map> hostMapList = mongoTemplate.find(query, Map.class, "host");
        // number of logical CPUs available to the JVM
        int processorsNum = Runtime.getRuntime().availableProcessors();
        // IO-intensive task: use 2 * (number of logical CPUs) threads;
        // for a compute-intensive task use (number of logical CPUs) + 1 instead
        int threadNum = processorsNum * 2;
        // split hostMapList into threadNum groups, one per thread
        int eachGroupNum = hostMapList.size() / threadNum;
        List<List<Map>> groupList = new ArrayList<>();
        for (int i = 0; i < threadNum; i++) {
            int start = i * eachGroupNum;
            // the last group also takes the remainder
            int end = (i == threadNum - 1) ? hostMapList.size() : (i + 1) * eachGroupNum;
            groupList.add(hostMapList.subList(start, end));
        }
        // update the data asynchronously; the pool size matches the number of groups
        ExecutorService executorService = Executors.newFixedThreadPool(threadNum);
        CountDownLatch countDownLatch = new CountDownLatch(threadNum);
        for (List<Map> group : groupList) {
            executorService.execute(() -> {
                try {
                    for (Map map : group) {
                        // update the data in MongoDB
                    }
                } catch (Exception e) {
                    e.printStackTrace();
                } finally {
                    // decrement the counter once this group is done
                    countDownLatch.countDown();
                }
            });
        }
        try {
            // the main thread waits here until all child threads finish
            countDownLatch.await();
        } catch (InterruptedException e) {
            // restore the interrupt flag so the caller can react to it
            Thread.currentThread().interrupt();
        } finally {
            // shut the thread pool down even if the wait was interrupted
            executorService.shutdown();
        }
        return ResponseData.success();
    }
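
    Because the method above depends on MongoDB and project classes, the same pool-plus-latch pattern can be tried out with in-memory data. The sketch below is only an illustration under stated assumptions: the class LatchDemo and the method processAll are hypothetical names, and incrementing an AtomicInteger stands in for the database update.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

/** Illustrative stand-in for the update method; all names are assumptions. */
final class LatchDemo {

    /** Processes every group on a fixed pool and returns how many items were handled. */
    static int processAll(List<List<Integer>> groups, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        // one count per submitted group, so await() returns when every group is done
        CountDownLatch latch = new CountDownLatch(groups.size());
        AtomicInteger processed = new AtomicInteger();
        try {
            for (List<Integer> group : groups) {
                pool.execute(() -> {
                    try {
                        for (Integer ignored : group) {
                            processed.incrementAndGet(); // stand-in for the database update
                        }
                    } finally {
                        latch.countDown();
                    }
                });
            }
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdown();
        }
        return processed.get();
    }

    public static void main(String[] args) {
        List<List<Integer>> groups = new ArrayList<>();
        for (int g = 0; g < 4; g++) {
            List<Integer> group = new ArrayList<>();
            for (int i = 0; i < 25; i++) {
                group.add(g * 25 + i);
            }
            groups.add(group);
        }
        System.out.println(processAll(groups, 4)); // prints 100
    }
}
```

    Calling countDown() in a finally block matters here: if a task threw before decrementing, the main thread would wait in await() indefinitely.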

    After switching to the multi-threaded asynchronous update strategy, the time needed for one call to the interface dropped from about 30-40 min to about 8-10 min, a large improvement in execution efficiency.

    Note that creating the thread pool with newFixedThreadPool has a flaw: its blocking queue is unbounded by default, with a capacity of Integer.MAX_VALUE, so waiting tasks can pile up and are likely to cause OOM problems. It is therefore generally better to create the thread pool with ThreadPoolExecutor directly and specify the capacity of the waiting queue, which avoids the OOM problem.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;
    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.CountDownLatch;
    import java.util.concurrent.ThreadPoolExecutor;
    import java.util.concurrent.TimeUnit;

    public ResponseData updateHostDept() {
        // ...
        List<Map> hostMapList = mongoTemplate.find(query, Map.class, "host");
        // number of logical CPUs available to the JVM
        int processorsNum = Runtime.getRuntime().availableProcessors();
        // IO-intensive task: use 2 * (number of logical CPUs) groups;
        // for a compute-intensive task use (number of logical CPUs) + 1 instead
        int threadNum = processorsNum * 2;
        // split hostMapList into threadNum groups
        int eachGroupNum = hostMapList.size() / threadNum;
        List<List<Map>> groupList = new ArrayList<>();
        for (int i = 0; i < threadNum; i++) {
            int start = i * eachGroupNum;
            // the last group also takes the remainder
            int end = (i == threadNum - 1) ? hostMapList.size() : (i + 1) * eachGroupNum;
            groupList.add(hostMapList.subList(start, end));
        }
        // 5 core threads, at most 8 threads, 30 s keep-alive for idle extra threads,
        // and a bounded waiting queue that holds at most 100 tasks
        ThreadPoolExecutor executor = new ThreadPoolExecutor(5, 8, 30L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(100));
        CountDownLatch countDownLatch = new CountDownLatch(threadNum);
        for (List<Map> group : groupList) {
            executor.execute(() -> {
                try {
                    for (Map map : group) {
                        // update the data in MongoDB
                    }
                } catch (Exception e) {
                    e.printStackTrace();
                } finally {
                    // decrement the counter once this group is done
                    countDownLatch.countDown();
                }
            });
        }
        try {
            // the main thread waits here until all child threads finish
            countDownLatch.await();
        } catch (InterruptedException e) {
            // restore the interrupt flag so the caller can react to it
            Thread.currentThread().interrupt();
        } finally {
            // shut the thread pool down even if the wait was interrupted
            executor.shutdown();
        }
        return ResponseData.success();
    }
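
    The queue difference between the two kinds of pool can be checked directly. The following is a small sketch, not part of the original method: the class QueueCapacityCheck and its method names are hypothetical, and the cast to ThreadPoolExecutor assumes an OpenJDK-style implementation in which Executors.newFixedThreadPool returns a ThreadPoolExecutor instance.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Illustrative check of work-queue capacity; all names are assumptions. */
final class QueueCapacityCheck {

    /** Free slots in the work queue behind Executors.newFixedThreadPool. */
    static int fixedPoolQueueCapacity() {
        // Assumption: the returned ExecutorService is a ThreadPoolExecutor (true for OpenJDK).
        ThreadPoolExecutor pool = (ThreadPoolExecutor) Executors.newFixedThreadPool(2);
        try {
            return pool.getQueue().remainingCapacity();
        } finally {
            pool.shutdown();
        }
    }

    /** Free slots in the work queue of a pool built with an explicit bounded queue. */
    static int boundedPoolQueueCapacity(int capacity) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                5, 8, 30L, TimeUnit.SECONDS, new ArrayBlockingQueue<>(capacity));
        try {
            return pool.getQueue().remainingCapacity();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(fixedPoolQueueCapacity());      // prints 2147483647 (Integer.MAX_VALUE)
        System.out.println(boundedPoolQueueCapacity(100)); // prints 100
    }
}
```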

    In the above code, the number of core threads and the maximum number of threads are 5 and 8 respectively. They are deliberately not set to large values: with too many threads, frequent context switching between threads adds overhead of its own, which increases the time spent instead of making the most of multi-threading. The appropriate parameters need to be chosen based on the machine's hardware and the type of task.

    Finally, it is also easy to find the number of CPU threads of a machine without writing code. On Windows, open Task Manager and select "Performance" to view the number of CPU threads, as shown in the picture below:
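
    How the core size, maximum size, and queue capacity interact can be observed with deliberately tiny values. The sketch below is an illustration under assumptions: the class SaturationDemo is a hypothetical name, and it uses 1 core thread, 2 maximum threads, and a queue of 1 instead of the 5, 8, and 100 above so that the pool saturates after three tasks. It relies on the default rejection policy of ThreadPoolExecutor, which throws RejectedExecutionException.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Illustrative saturation walk-through; all names are assumptions. */
final class SaturationDemo {

    /** Submits four blocking tasks and records the pool state after each submission. */
    static List<String> run() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 30L, TimeUnit.SECONDS, new ArrayBlockingQueue<>(1));
        CountDownLatch release = new CountDownLatch(1);
        // every task blocks until released, so no thread becomes free during the test
        Runnable blocker = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        List<String> steps = new ArrayList<>();
        try {
            for (int i = 0; i < 4; i++) {
                try {
                    pool.execute(blocker);
                    steps.add("threads=" + pool.getPoolSize()
                            + " queued=" + pool.getQueue().size());
                } catch (RejectedExecutionException e) {
                    steps.add("rejected");
                }
            }
        } finally {
            release.countDown(); // let the blocked tasks finish
            pool.shutdown();
        }
        return steps;
    }

    public static void main(String[] args) {
        // prints: threads=1 queued=0, threads=1 queued=1, threads=2 queued=1, rejected
        run().forEach(System.out::println);
    }
}
```

    The order is: core threads first, then the queue, then extra threads up to the maximum, then rejection. Applied to the values above, this suggests that the threads beyond the 5 core ones are only created once 100 tasks are already waiting; since the method submits 2 * (number of logical CPUs) groups, on most machines the queue would not fill and only the 5 core threads would run.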

    [Image: CPU information on the "Performance" tab of Windows Task Manager]

    As the picture shows, my machine has eight physical CPU cores. With hyper-threading technology, each physical core is presented as two logical CPU threads, so the machine has 8 cores and 16 threads.


    Statement
    This article is reproduced from 亿速云. If there is any infringement, please contact admin@php.cn to have it removed.