
Watch the performance battle between PHP7 and HHVM

coldplay.xixi
2020-06-24 17:47:02



The performance comparison between PHP7 and HHVM has recently become a hot and controversial topic. Everyone is asking which of the two is better and which represents the future of PHP performance.

The origin of HHVM (HipHop Virtual Machine)

HHVM is an open source PHP virtual machine that uses JIT compilation and other techniques to greatly improve the execution performance of PHP code. It is reported to run PHP code 5-10 times faster than the current version of native PHP.

HHVM originated at Facebook. Much of Facebook's early code was written in PHP, but as the business grew rapidly, PHP's execution efficiency became an increasingly obvious problem. To address it, Facebook began using HipHop in 2008, a PHP execution engine originally designed to convert Facebook's large PHP code base into C++ in order to improve performance and save resources. PHP code run through HipHop became several times faster. Facebook later open sourced the HipHop platform and gradually developed it into today's HHVM.

1. Why is PHP slow?

PHP is slower than languages at the level of C/C++. The PHP language was not originally designed for compute-intensive workloads; roughly speaking, PHP sacrifices execution efficiency in exchange for development efficiency.

A major feature of PHP is weak typing: a variable can be defined freely and then assigned data of any type. Take an integer as an example. In C:

int num = 200; // Usually 4 bytes

However, when PHP defines the same variable, the value is held in the Zend engine's generic variable structure (the zval):

[Figure: the storage structure behind a PHP variable]

This structure occupies much more memory than the C variable. In PHP the definition is simply:

$a = 200; // This variable actually occupies many times the storage of the C variable.

In fact, whatever type of data PHP stores, it uses this same heavyweight structure. To accommodate variables that can change type at will, PHP is friendly to developers but hard on the execution engine. The memory cost of a single variable may not look significant, but once PHP arrays are involved (they are implemented as a HashTable) the overhead grows much further. When the Zend engine runs, the PHP code is first compiled into opcodes (PHP's intermediate bytecode, somewhat similar in format to assembly), which the Zend engine then interprets and executes one instruction at a time.
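A rough way to observe this overhead from PHP itself is to measure how much memory an array of integers consumes. The sketch below is only an illustration: the helper name bytesPerInt is hypothetical, and the figure it prints depends on the PHP version and platform (PHP 7 and later use a more compact representation than PHP 5), so no expected output is given.

```php
<?php
// Estimate the average memory cost of one integer stored in a PHP array.
// bytesPerInt() is a hypothetical helper written for this illustration.
function bytesPerInt(int $count): float
{
    $before = memory_get_usage();
    $numbers = range(1, $count);   // $count integers held in a PHP array
    $after = memory_get_usage();
    unset($numbers);               // release the array again

    return ($after - $before) / $count;
}

printf("about %.1f bytes per integer (a C int is usually 4)\n", bytesPerInt(100000));
```

The number includes the array's own bookkeeping, so it is an average per element rather than the exact size of a single variable.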

Whether it is string concatenation or a simple array modification, a single line from the PHP programmer makes the Zend engine do a great deal of work behind the scenes. For the same operation, PHP therefore consumes more CPU and memory than C. Automatic memory reclamation, run-time type checks, and similar mechanisms add further overhead.

As an example, I sorted 10,000 integers with a quicksort implemented in pure PHP and with the native sort function, and compared the time taken.

The native sort took 3.44 ms, while our own PHP sort function took 68.79 ms, a huge gap in execution efficiency. I measured the time interval immediately before and after the function call, not the total run time of the whole PHP script, because script startup and shutdown involve a series of initialization and cleanup steps that also take considerable time.
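The measurement described above can be sketched as follows. This is not the author's original test script: the quickSort function, the fixed random seed, and the value range are assumptions made for the illustration, and the timings will differ from the figures quoted above depending on the PHP version and hardware.

```php
<?php
// Hand-written quicksort in pure PHP (not in-place, kept simple for clarity).
function quickSort(array $items): array
{
    if (count($items) < 2) {
        return $items;
    }
    $pivot = $items[0];
    $left = $right = [];
    for ($i = 1, $n = count($items); $i < $n; $i++) {
        if ($items[$i] < $pivot) {
            $left[] = $items[$i];
        } else {
            $right[] = $items[$i];
        }
    }
    return array_merge(quickSort($left), [$pivot], quickSort($right));
}

// Build 10,000 pseudo-random integers (fixed seed so runs are repeatable).
mt_srand(42);
$data = [];
for ($i = 0; $i < 10000; $i++) {
    $data[] = mt_rand(0, 1000000);
}

// Time only the sorting calls, not script startup or shutdown.
$start = microtime(true);
$sortedByPhp = quickSort($data);
$phpMs = (microtime(true) - $start) * 1000;

$native = $data;
$start = microtime(true);
sort($native);
$nativeMs = (microtime(true) - $start) * 1000;

printf("pure PHP quicksort: %.2f ms, native sort(): %.2f ms\n", $phpMs, $nativeMs);
```

Both calls produce the same sorted list; only the time spent differs.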

Normally, the ranking of PHP execution efficiency is:

  1. Fastest are PHP language constructs (isset, echo, etc.), which are part of the language itself and are not functions at all.
  2. Next come PHP's native and extension functions. PHP extensions are built on the Zend API and implemented in C, so their execution efficiency is of the same order of magnitude as C++/Java.
  3. What is really slow is the code and functions we write ourselves in PHP. For example, a relatively heavy framework implemented in pure PHP has many modules of its own, so it noticeably drags down execution efficiency at the language level and uses more memory. (The Yaf framework, developed in China, is implemented as an extension, so it runs much faster than frameworks written in pure PHP.)

In general, we do not recommend implementing complex computational logic in PHP, especially when the Web system handles heavy traffic. PHP programmers should therefore know PHP's native functions and extensions broadly and, for a given feature, look for a native solution (a built-in function or an extension) rather than writing a pile of complex PHP code to implement it themselves.
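As a small illustration of preferring a native solution, the sketch below counts how often each value occurs, first with a hand-written loop and then with the built-in array_count_values. The function name countValuesManually and the sample data are made up for this example.

```php
<?php
// Hand-written counting loop: every iteration runs as interpreted opcodes.
function countValuesManually(array $values): array
{
    $counts = [];
    foreach ($values as $value) {
        if (isset($counts[$value])) {
            $counts[$value]++;
        } else {
            $counts[$value] = 1;
        }
    }
    return $counts;
}

$tags = ['php', 'hhvm', 'php', 'jit', 'hhvm', 'php'];

$manual = countValuesManually($tags);
$native = array_count_values($tags);   // the same job done inside the engine, in C

var_dump($manual === $native);
```

The two results are identical, but the native call does the looping in C, which is where the efficiency ranking above says the time is saved.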

If you have the skills to develop PHP extensions, rewriting such business functions as an extension will also greatly improve execution efficiency. This is a very good approach and is widely used in PHP optimization. However, the drawbacks of writing your own extensions for business logic are also obvious:

  1. Extension development takes a long time, and changes are complicated when requirements shift. A poorly written extension can affect the stability of the Web service. (For example, in Apache's worker mode, a crash in a multi-threaded scenario affects the other healthy threads in the same process. In a multi-threaded Web mode, the extension must also be thread-safe.)
  2. Upgrading the PHP version may require additional compatibility work on the extension.
  3. Maintenance and handover costs after personnel changes are also relatively high.

In fact, at front-line Internet companies the more common solution is not to add PHP extensions but to write an independent service in C/C++, with PHP communicating with that service over sockets. This keeps the business logic decoupled from PHP itself.
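A minimal sketch of the PHP side of such a setup is shown below. Everything specific in it is an assumption made for illustration: the wire format (a 4-byte big-endian length followed by a JSON body), the function names, and the service address are hypothetical and do not describe any particular company's protocol.

```php
<?php
// Hypothetical wire format for this sketch: 4-byte big-endian length + JSON body.
function encodeRequest(array $payload): string
{
    $body = json_encode($payload);
    return pack('N', strlen($body)) . $body;
}

// Decode one complete frame produced by encodeRequest().
function decodeMessage(string $frame): array
{
    $header = unpack('Nlength', substr($frame, 0, 4));
    $body = substr($frame, 4, $header['length']);
    return json_decode($body, true);
}

// Send one request to a backend service and return its decoded reply.
// The address (for example "tcp://127.0.0.1:9000") is an assumption, not a real service.
function callService(string $address, array $payload): array
{
    $socket = stream_socket_client($address, $errno, $errstr, 1.0);
    if ($socket === false) {
        throw new RuntimeException("connect failed: $errstr ($errno)");
    }
    fwrite($socket, encodeRequest($payload));
    $header = unpack('Nlength', stream_get_contents($socket, 4));
    $reply = stream_get_contents($socket, $header['length']);
    fclose($socket);
    return json_decode($reply, true);
}

// Round trip without a network, to show the framing on its own.
var_dump(decodeMessage(encodeRequest(['op' => 'sort', 'n' => 3])));
```

The PHP code only packages the request and unpacks the reply; the heavy computation stays in the separate service.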

However, most performance bottlenecks in Web services lie in network transmission and in other backend services (such as MySQL). PHP execution time accounts for a very small share of the total, so from a business perspective the impact may not be obvious.

2. The way HHVM improves PHP execution performance

HHVM improves PHP performance by replacing the Zend engine: HHVM generates and executes its own format of intermediate bytecode, and uses JIT (Just In Time compilation, a software optimization technique in which bytecode is compiled into machine code at run time) to turn that bytecode into machine code for execution. The Zend engine's default approach is to compile the code into opcodes and then execute them one by one, with each instruction usually corresponding to a C-level function. If the code produces a large number of repeated opcodes (code and functions written in pure PHP), Zend executes those C functions one by one, over and over. JIT goes a step further and compiles bytecode that is executed repeatedly into machine code at run time to improve efficiency. Typically, the JIT is triggered when a piece of code or a function has been called multiple times.
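The "called multiple times" trigger can be pictured with a toy model. The sketch below is not how HHVM or the Zend engine is implemented: ToyVm, HOT_THRESHOLD, and the two-opcode instruction format are invented for this illustration, and the "compiled" form is just a PHP closure standing in for machine code.

```php
<?php
// Toy illustration of a call-count trigger. HOT_THRESHOLD, ToyVm and the
// instruction format are invented for this sketch; no real engine works this way.
const HOT_THRESHOLD = 3;

final class ToyVm
{
    private $program;           // list of [opcode, operand] pairs
    private $calls = 0;         // how many times the program has been interpreted
    private $compiled = null;   // fast path, built once the program is "hot"

    public function __construct(array $program)
    {
        $this->program = $program;
    }

    public function run(int $x): int
    {
        if ($this->compiled !== null) {
            return ($this->compiled)($x);           // fast path, no dispatch
        }
        if (++$this->calls >= HOT_THRESHOLD) {
            $this->compiled = $this->compile();     // used from the next call on
        }
        return $this->interpret($x);
    }

    public function isCompiled(): bool
    {
        return $this->compiled !== null;
    }

    // Slow path: dispatch on every instruction, every time.
    private function interpret(int $x): int
    {
        foreach ($this->program as [$op, $operand]) {
            switch ($op) {
                case 'add': $x += $operand; break;
                case 'mul': $x *= $operand; break;
                default: throw new InvalidArgumentException("unknown opcode $op");
            }
        }
        return $x;
    }

    // "Compile": fold the instruction list into a single function a*x + b.
    private function compile(): callable
    {
        $a = 1;
        $b = 0;
        foreach ($this->program as [$op, $operand]) {
            if ($op === 'add') {
                $b += $operand;
            } elseif ($op === 'mul') {
                $a *= $operand;
                $b *= $operand;
            } else {
                throw new InvalidArgumentException("unknown opcode $op");
            }
        }
        return function (int $x) use ($a, $b): int {
            return $a * $x + $b;
        };
    }
}

$vm = new ToyVm([['mul', 2], ['add', 3], ['mul', 5]]);
for ($i = 1; $i <= 5; $i++) {
    printf("call %d -> %d\n", $i, $vm->run($i));
}
var_dump($vm->isCompiled());
```

The first calls pay the per-instruction dispatch cost; once the counter reaches the threshold, later calls go straight to the prepared fast path, which is the same trade-off a real JIT makes with machine code.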

In ordinary PHP code the type of a variable is not fixed, so extra logic must be added to determine the type at run time. Such code is hard for the CPU to execute and optimize efficiently. HHVM therefore usually needs PHP code written in the Hack style (extra annotations added to the code for compatibility) to "cooperate": types are fixed in the source so that the virtual machine can compile and execute it more easily. PHP aims to hold every type in a single form, whereas Hack marks everything it holds with a definite type.

Example of PHP code written in the Hack style:

[Figure: PHP code with Hack type annotations]

In this example, the main change is that variable types are added to the PHP code. The overall direction of the Hack style is to turn the previous "dynamic" way of writing into a "static" one in order to cooperate with HHVM.
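The same "dynamic to static" idea can be sketched in plain PHP, since PHP 7 and later support scalar parameter and return type declarations that look much like Hack's annotations. The code below is PHP, not Hack, and the function names addDynamic and addTyped are made up for this illustration.

```php
<?php
// Untyped version: the argument types are only known at run time.
function addDynamic($a, $b)
{
    return $a + $b;
}

// Typed version: parameter and return types are declared up front.
function addTyped(int $a, int $b): int
{
    return $a + $b;
}

var_dump(addDynamic(1, 2.5));   // the result type follows the arguments
var_dump(addTyped(1, 2));       // always an int

try {
    addTyped(1, "two");         // a non-numeric string cannot be used as an int
} catch (TypeError $e) {
    echo "TypeError: argument rejected\n";
}
```

With the types written down, a mismatched call fails at the function boundary instead of flowing through the code with an unexpected type.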

HHVM has attracted a lot of attention because of its high performance, and some first-tier Internet companies have begun to adopt it. Judging from pure language-level benchmark results, HHVM is well ahead of the PHP7 version under development.

However, in concrete business scenarios the gap between HHVM and PHP7 is not that large. In a test that used the home page of the open source WordPress blog as the scenario, the current difference between them was not obvious.

PHP7 is still under development, though. Among the technical solutions available today, HHVM is slightly better. Even so, deploying and running HHVM has some problems:

  1. Service deployment is complicated and has certain maintenance costs.
  2. It does not fully support native PHP code, and PHP extensions also need compatibility work.
  3. HHVM is a new virtual machine and may leak memory when running for a long time. (It is said that first-tier Internet companies using this technology fix the memory leaks with their own patches.)

HHVM is, after all, a relatively new open source project, and it will take some time to mature.

Performance Innovation of PHP7

The performance problems for which PHP has long been criticized will be greatly improved in this version. There is no PHP6 in between: it is said that a PHP6 project once existed and that most of its features were later implemented in the 5.x releases. To avoid confusion, the next major version will go straight to PHP7. (A few years ago I even saw books about PHP6.)

1. Introduction to PHP7

The official release of PHP7 may not arrive until October 2015, but a test version should be available in June of that year, followed by 3-4 months of quality assurance.

The project plan of the PHP community is as follows:

[Figure: PHP7 project timetable]

Because the project is still under development, the feature descriptions in the timetable are rather vague. There are certainly other features that simply have not been announced yet. The items below come from the PHP community; since PHP7 is still in development they may not be accurate, but they are still worth a look.

  1. PHPNG (PHP Next Generation): various performance optimizations of the Zend execution engine itself; a JIT may be implemented in the Zend Opcache component.
  2. AST (Abstract Syntax Tree): introduces an intermediate structure into the PHP compilation process, replacing the previous approach of emitting opcodes directly from the parser. Decoupling the parser from the compiler removes many hacks from the code and makes the implementation easier to understand and maintain.
  3. Uniform variable syntax: introduces an internally consistent and complete variable syntax so that the PHP parser supports all kinds of variable expressions more fully. Some usages need to be adjusted, such as variable variables ($$a).
  4. Integer semantics: makes integer behavior more consistent, for example for NaN, Infinity, and the << and >> operators, and fixes inconsistencies in list(), etc.
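Two of the items in the list can be made concrete with a short sketch, based on how these changes eventually shipped in PHP 7. The variable names and the helpers shiftLeft and shiftRight are hypothetical. For the variable syntax, explicit braces are used because they spell out each reading unambiguously and behave the same in PHP 5 and PHP 7, whereas a bare $$name['key'] is read the first way by PHP 5 and the second way by PHP 7.

```php
<?php
// Uniform variable syntax: the two possible readings of $$name['key'],
// written with explicit braces so the meaning does not depend on the version.
$target = 'value of $target';
$name = ['key' => 'target'];
$php5Reading = ${$name['key']};    // take $name['key'], use it as a variable name

$config = ['key' => 'value of $config["key"]'];
$ref = 'config';
$php7Reading = ${$ref}['key'];     // resolve $$ref first, then index the result

echo $php5Reading, "\n", $php7Reading, "\n";

// Integer semantics: small helpers (hypothetical names) to show the PHP 7 rules.
function shiftLeft(int $value, int $bits): int
{
    return $value << $bits;
}

function shiftRight(int $value, int $bits): int
{
    return $value >> $bits;
}

var_dump(shiftLeft(1, 64));        // shifting past the integer width gives 0 in PHP 7+

try {
    shiftRight(1, -1);             // negative shift counts are rejected in PHP 7+
} catch (ArithmeticError $e) {
    echo "ArithmeticError: ", $e->getMessage(), "\n";
}
```

In PHP 5 the results of oversized or negative shifts depended on the platform, which is the inconsistency the integer semantics item set out to remove.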

Of the features above, the most anticipated is the performance optimization in PHPNG. The PHP community has released some benchmark data showing that PHPNG's execution performance has nearly doubled since the project began, which is already a very good result. More importantly, many optimizations planned for PHP7 are not yet complete; once they are, we can expect a PHP7 with even higher performance.

The benchmark data comes from the PHP community (wiki.php.net/phpng); part of it is excerpted here:

[Figure: PHPNG benchmark results]

Compared with the current PHP 5.6, PHPNG's performance improvement as of October is already very obvious. In brief:

  • Synthetic benchmarks run 35% faster.
  • Real-world applications are 20%-70% faster (the WordPress home page improves by 60%).
  • Lower memory consumption.
  • Supports most commonly used SAPIs.
  • Supports most PHP extensions bundled with the source distribution (69 done, 6 still to be migrated).
  • Delivers execution speed comparable to HHVM 3.3.0.

2. PHP’s weak type controversy

PHP has many controversial features, but as the language has evolved through successive releases, criticism of its functionality and features has begun to fade. PHP's "weak typing", however, clearly remains more controversial. That HHVM effectively "removed" weak typing through Hack shows that HHVM does not favor this feature. Yet many of us PHP programmers regard it as one of PHP's important advantages. PHP variables are designed to be casual and elegant and to hold anything; doesn't that make the language simpler?

In fact, some people think it is a serious problem. The criticism of "weak typing" is roughly as follows:

  1. In "strict" languages, the type of a variable is usually declared in advance and stays fixed from beginning to end, and its scope of use is also fixed. With PHP variables we can usually see only the name; the type mostly cannot be declared in advance and can change at will. (Memory allocation is hard to manage.)
  2. To support weak typing, PHP has to implement a large amount of supporting code for type checks, type conversion, storage, and so on, which increases the internal complexity of the language. (Low execution efficiency.)
  3. Variable types are uncontrollable. Execution involves a large number of "implicit type conversions", which can easily produce unexpected results. (It is worth stressing that PHP type conversion is something every developer must master. Conversions between types can cause many problems, especially for newcomers to PHP.)
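A few implicit conversions of the kind the third criticism refers to are sketched below. The cases were chosen because they behave the same across PHP 5, 7, and 8; the variable names are only for illustration.

```php
<?php
// Loose comparison (==) converts operands before comparing; strict (===) does not.
$looseNumeric  = ("10" == "1e1");    // both are numeric strings, compared as numbers
$strictNumeric = ("10" === "1e1");   // compared as strings, no conversion
$zeroString    = ("0" == false);     // the string "0" converts to false

// in_array() compares loosely unless the third argument is true.
$looseSearch  = in_array("1e1", ["10"]);
$strictSearch = in_array("1e1", ["10"], true);

// Arithmetic on numeric strings silently yields a number.
$sum = "5" + "5";

var_dump($looseNumeric, $strictNumeric, $zeroString, $looseSearch, $strictSearch, $sum);
```

Using === and the strict flag of in_array avoids these surprises when the types of the operands are not under the programmer's control.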

They argue that none of this fits the simplicity of "what you see is what you get", and that languages with strict syntax are more efficient and easier to understand.

Languages such as JavaScript have drawn similar criticism, because they behave the same way on this point. However, a language that ends up being used on a large scale must have its reasons. PHP has become the scripting language of choice for Web service development, and JavaScript dominates the Web front end outright; neither got there by accident. Developers voted with their feet. A programming language is a bridge between humans and machines, and the ultimate pursuit is the grand goal that "everyone can program".

Looking back over the history of programming languages, we went from machine code of 0s and 1s to assembly language, then to C, and then to dynamic scripting languages such as PHP. Execution efficiency has dropped steeply along the way, but so has the barrier to learning. PHP hides not only the complexity of C's memory management and pointers but also the complexity of variable types. It improves development efficiency and lowers the learning threshold, at the cost of some execution performance. HHVM's Hack then feels like a return to basics, reintroducing the complexity of variable types. Of course, different languages solve problems in different scenarios, and one cannot generalize.

Summary

HHVM's performance improvements for PHP are impressive, and PHP7's efforts are well worth looking forward to. Both are excellent open source projects that keep moving forward. For now, because the official release of PHP7 is still a long way off, the obvious choice for performance optimization is HHVM. Personally, though, I am more optimistic about PHP7 because it is more backward compatible with existing PHP code. If the two perform about the same, I will choose the simpler one.


Statement:
This article is reproduced from csdn.net. In case of any infringement, please contact admin@php.cn for removal.