Home  >  Article  >  Backend Development  >  Load balancing difficulties in multi-core programming_PHP tutorial

Load balancing difficulties in multi-core programming_PHP tutorial

WBOY
WBOYOriginal
2016-07-20 11:05:401441browse

In multi-core CPUs, if you want to fully utilize the performance of multiple CPUs, you must ensure that the tasks assigned to each CPU have a good load balance. Otherwise, some CPUs are running and other CPUs are idle, and the advantages of multi-core CPUs cannot be used.


There are usually two solutions to achieve a good load balancing, one is static load balancing and the other is dynamic load balancing.


1. Static load balancing


In static load balancing, you need to manually divide the program into multiple parts that can be executed in parallel, and you must ensure that the divided parts are It can be evenly distributed to run on each CPU, which means that the workload must be evenly distributed among multiple tasks to achieve a high acceleration factor.


Mathematically speaking, the static load balancing problem is an NP-complete problem. Richard M. Karp, Jeffrey D. Ullman, Christos H. Papadimitriou, M. Garey, D. Johnson and others have successively worked on The NP-completeness of the static load problem under several different constraints was demonstrated between 1972 and 1983.


Although the NP-completeness problem is a difficult problem in mathematics, it is not the difficult problem mentioned in the title, because NP-completeness problems can generally be solved by very effective approximation algorithms.


2. Dynamic load balancing


Dynamic load balancing is to allocate tasks during the running process of the program to achieve the purpose of load balancing. In actual situations, there are many problems that cannot be solved by static load balancing. For example, in a large loop, the number of loops is input from the outside, and the number of loops is not known in advance. In this case, it is difficult to implement the static load balancing division strategy. Load balancing.


The scheduling of tasks in dynamic load balancing is generally implemented by the system. Programmers can usually only choose the dynamic balancing scheduling strategy and cannot modify the scheduling strategy, because there are many inconsistencies in actual tasks. Due to certain factors, the scheduling algorithm cannot do a very good job, so dynamic load balancing may sometimes not meet the established load balancing requirements.


3. What is the problem with load balancing?


The problem of load balancing does not lie in the degree of load balancing, because even if there are some gaps in the task execution time allocated on each CPU, as the number of CPU cores increases, it can always be achieved The total execution time decreases, so that the acceleration factor increases with the increase in the number of CPU cores.


The difficulty of load balancing is that many parallel execution blocks in the program must be divided by programmers. Of course, when the number of CPU cores is small, such as dual-core or 4-core, this division is not Very difficult. But as the number of cores increases, the granularity of division will become increasingly finer. When the number of cores exceeds 16, programmers will probably go crazy over how to divide tasks. For example, if a piece of sequentially executed code is run on a 128-core CPU, it must be manually divided into 128 tasks. The difficulty of the division can be imagined.


The error in load division will amplify as the number of CPU cores increases. For example, a program that takes 16 time units is divided into 4 tasks for execution, and the average load execution time on each task is is 4 time units and the division error is 1 time unit, then the acceleration coefficient becomes 16/(4 1)=3.2, which is 80% of the acceleration coefficient 4 under ideal circumstances. But if it is run on a 16-core CPU, if the division error of a certain task is 0.5 time units, then the acceleration coefficient becomes 16/(1 0.5) = 10.67, which is only 66.7% of the ideal acceleration coefficient of 16 , if the number of cores increases further, the ratio of the acceleration coefficient to the ideal acceleration coefficient will decrease due to the amplification of errors.


The problem of load division is also reflected in the upgrade of CPU and software. For example, the load division on a 4-core CPU is balanced, but on an 8-core or 16-core CPU, the load may become It's unbalanced. The same goes for software upgrades. When the software adds functions, the load balance will be destroyed, and the load needs to be re-divided to achieve balance. This greatly increases the difficulty and trouble of software design.


If locks are used, some seemingly balanced loads may become unbalanced due to lock competition.


4. Load balancing strategies


For software with a small amount of calculation, it will run very fast even if it is placed on a single-core CPU, and the load balancing is done well The difference does not have much impact. In actual load balancing, software with a large amount of calculation and large scale needs to be considered. These software need to be load balanced on multiple cores to make better use of multiple cores to improve performance.


For large-scale software, the response strategy adopted in load balancing is to develop a macro-partitioning method of dividing parallel blocks, and divide it from the entire software system level, rather than targeting certain parts as in the traditional method. Programs and algorithms are used for parallel decomposition, because it is usually difficult to decompose local programs into more than dozens of tasks to run.


Another coping strategy is at the tool level, that is, compilation tools can assist manual decomposition of parallel blocks and find good decomposition solutions. Intel has made some efforts in this regard, but more efforts are needed to make the tools The function is more powerful to cope with the situation when the number of cores is large.


www.bkjia.comtruehttp: //www.bkjia.com/PHPjc/445099.htmlTechArticleIn a multi-core CPU, if you want to fully utilize the performance of multiple CPUs, you must ensure that it is allocated to each CPU. The tasks have a good load balancing. Otherwise some CPUs are running, others are...
Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn