Explore the function operation mechanism of PHP
In any language, functions are the most basic building blocks. What are the characteristics of PHP functions? How is a function call implemented? How do PHP functions perform, and how should they be used? This article tries to answer these questions by analyzing the underlying principles and combining them with actual performance tests, so that understanding the implementation helps us write better PHP programs. Some common PHP functions are also introduced along the way.
In PHP, functions fall into two broad categories: user functions (user-defined functions) and internal functions (built-in functions). The former are the functions and methods that users define in their own programs; the latter are the library functions provided by PHP itself (such as sprintf, array_push, etc.). Users can also write library functions of their own as extensions, which is introduced later. User functions can be further subdivided into functions (plain functions) and methods (class methods). This article analyzes and tests all three kinds: built-in functions, user functions, and class methods.
How is a PHP function ultimately executed? What is the process like?
To answer this question, let’s first take a look at the process of executing the PHP code.
As you can see from the picture above, PHP implements a typical dynamic language execution process: after a piece of code goes through stages such as lexical analysis and syntax analysis, the source program is translated into instructions (opcodes). The Zend virtual machine then executes these instructions in sequence to complete the operation. PHP itself is implemented in C, so the functions ultimately called are all C functions. In fact, we can regard PHP as a piece of software developed in C.
It is not difficult to see from the above description that the execution of functions in PHP is also translated into opcodes for calling. Each function call actually executes one or more instructions.
Zend describes each function with the following data structure:
typedef union _zend_function {
    zend_uchar type;    /* MUST be the first element of this struct! */

    struct {
        zend_uchar type;    /* never used */
        char *function_name;
        zend_class_entry *scope;
        zend_uint fn_flags;
        union _zend_function *prototype;
        zend_uint num_args;
        zend_uint required_num_args;
        zend_arg_info *arg_info;
        zend_bool pass_rest_by_reference;
        unsigned char return_reference;
    } common;

    zend_op_array op_array;
    zend_internal_function internal_function;
} zend_function;

typedef struct _zend_function_state {
    HashTable *function_symbol_table;
    zend_function *function;
    void *reserved[ZEND_MAX_RESERVED_RESOURCES];
} zend_function_state;
The type member indicates the kind of function: user function, built-in function, or overloaded function. The common member holds the basic information of the function, including the function name, parameter information, and function flags (ordinary function, static method, abstract method, and so on).
Built-in functions are essentially real C functions. Each built-in function is expanded into a C function named zif_xxxx when PHP is compiled. For example, the familiar sprintf corresponds to zif_sprintf at the bottom layer. When Zend encounters a built-in function during execution, it simply forwards the call to that C function.
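To make the zif_ naming and the forwarding step concrete, here is a minimal sketch in C. The DEMO_FN and DEMO_FUNCTION macros, the demo_function_entry table, and demo_call are hypothetical simplifications introduced for illustration; the real PHP_FUNCTION macro also prefixes the name with zif_, but it uses the engine's own parameter list rather than a single long.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the Zend macros: paste the zif_ prefix
   onto the function name, as the engine does for built-in functions. */
#define DEMO_FN(name) zif_##name
#define DEMO_FUNCTION(name) long DEMO_FN(name)(long arg)

typedef long (*demo_handler)(long arg);

/* One row of a simplified function table: script-level name -> C function. */
typedef struct {
    const char *name;
    demo_handler handler;
} demo_function_entry;

/* These expand to: long zif_twice(long arg) and long zif_negate(long arg). */
DEMO_FUNCTION(twice)  { return arg * 2; }
DEMO_FUNCTION(negate) { return -arg; }

static const demo_function_entry demo_functions[] = {
    { "twice",  DEMO_FN(twice)  },
    { "negate", DEMO_FN(negate) },
};

/* Look the function up by name and forward the call to the C function.
   Returns 1 on success and 0 if no such function is registered. */
int demo_call(const char *name, long arg, long *result)
{
    size_t i;
    assert(name != NULL && result != NULL);
    for (i = 0; i < sizeof demo_functions / sizeof demo_functions[0]; i++) {
        if (strcmp(demo_functions[i].name, name) == 0) {
            *result = demo_functions[i].handler(arg);
            return 1;
        }
    }
    return 0;
}
```

Calling demo_call("twice", 21, &r) ends up in zif_twice, so the only cost on top of the C function itself is the lookup and one indirect call.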
Zend provides a series of APIs for built-in functions to call, covering parameter acquisition, array operations, memory allocation, and so on. A built-in function obtains its parameters through the zend_parse_parameters function. For parameters such as arrays and strings, Zend performs only a shallow copy, so this step is very efficient. It is fair to say that a PHP built-in function is almost as efficient as the corresponding C function, the only overhead being the additional forwarding call.
Built-in functions are loaded dynamically by PHP from shared objects (.so files). Users can also write their own shared objects according to their needs, which is what we usually call extensions. Zend provides a series of APIs for extensions to use.
Compared with built-in functions, user-defined functions implemented through PHP have completely different execution processes and implementation principles. As mentioned above, we know that PHP code is translated into opcodes for execution, and user functions are no exception. In fact, each function corresponds to a set of opcodes, and this set of instructions is saved in zend_function. Therefore, the call of the user function ultimately corresponds to the execution of a set of opcodes.
Saving local variables and implementing recursion: We know that function recursion is implemented through a stack, and PHP uses a similar approach. Zend assigns an active symbol table (active_sym_table) to each PHP function call to record the state of all local variables in the current function. All symbol tables are maintained in the form of a stack. Whenever a function is called, a new symbol table is allocated and pushed onto the stack. When the call ends, the current symbol table is popped off the stack. This enables state preservation and recursion.
Zend optimizes the maintenance of this stack by pre-allocating a static array of length N to simulate it. Simulating a dynamic data structure with a static array is a technique we often use in our own programs as well. It avoids the memory allocation and destruction that each call would otherwise cause: at the end of a function call, Zend only cleans the symbol table data on the top of the stack.
Because the static array has length N, a call depth beyond N does not overflow the stack. Instead, Zend allocates and destroys the extra symbol tables dynamically, which degrades performance considerably. In Zend, the current value of N is 32. Therefore, when we write PHP programs, it is best not to exceed 32 levels of function calls. Of course, in a web application the function call depth can itself be considerable.
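The idea of a preallocated array with a fallback to dynamic allocation can be sketched as follows. This is only an illustration of the scheme described above, not Zend's code: the demo_symtable and demo_call_stack types, their fields, and the heap_allocs counter are all hypothetical, and only the size 32 is taken from the text.

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_CACHE_SIZE 32   /* N from the text */

/* Placeholder symbol table: a real one would hold the local variables. */
typedef struct demo_symtable {
    struct demo_symtable *prev;  /* symbol table of the caller */
    int num_vars;
} demo_symtable;

typedef struct {
    demo_symtable cache[DEMO_CACHE_SIZE]; /* preallocated slots */
    demo_symtable *top;                   /* table of the running function */
    int depth;                            /* current call depth */
    int heap_allocs;                      /* how often we fell back to malloc */
} demo_call_stack;

void demo_stack_init(demo_call_stack *s)
{
    s->top = NULL;
    s->depth = 0;
    s->heap_allocs = 0;
}

/* Enter a function: reuse a preallocated slot while depth < N,
   otherwise pay for a heap allocation. Returns NULL if malloc fails. */
demo_symtable *demo_push(demo_call_stack *s)
{
    demo_symtable *t;
    if (s->depth < DEMO_CACHE_SIZE) {
        t = &s->cache[s->depth];
    } else {
        t = malloc(sizeof *t);
        if (t == NULL) {
            return NULL;
        }
        s->heap_allocs++;
    }
    t->prev = s->top;
    t->num_vars = 0;
    s->top = t;
    s->depth++;
    return t;
}

/* Leave a function: heap tables are freed, cached slots are only cleaned. */
void demo_pop(demo_call_stack *s)
{
    demo_symtable *t = s->top;
    assert(t != NULL);
    s->top = t->prev;
    s->depth--;
    if (s->depth >= DEMO_CACHE_SIZE) {
        free(t);            /* this table came from malloc */
    } else {
        t->num_vars = 0;    /* slot stays allocated for the next call */
    }
}
```

With this layout the first 32 nested calls never touch the allocator, while every call beyond that depth costs one malloc and one free.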
Passing parameters: Unlike built-in functions, which call zend_parse_parameters to obtain their parameters, user functions obtain their parameters through instructions. A function has as many parameter-receiving instructions as it has parameters, and each one is implemented as an ordinary variable assignment. It follows from the above analysis that, because a user function has to maintain its own symbol table stack and each of its instructions is executed as a C function, its performance is considerably worse than that of a built-in function. A specific comparative analysis follows later. Therefore, if PHP already provides a built-in function for a task, try not to re-implement it yourself.
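The following toy interpreter sketches why a user function costs more than a built-in one: every instruction, including the ones that receive parameters, is dispatched to its own C handler. The opcode names, the vm_op layout, and vm_add_ops (a hand-written instruction list for a function that returns the sum of its two parameters) are hypothetical and far simpler than real Zend opcodes.

```c
#include <assert.h>
#include <stddef.h>

#define VM_MAX_LOCALS 8

typedef enum { VM_RECV, VM_ADD, VM_RETURN } vm_opcode;

/* One instruction: operands are indexes into the argument or local arrays. */
typedef struct {
    vm_opcode opcode;
    int op1;
    int op2;
    int result;
} vm_op;

/* Execution state of one call. */
typedef struct {
    long locals[VM_MAX_LOCALS];
    const long *args;
    long retval;
    int done;
} vm_frame;

typedef void (*vm_handler)(vm_frame *frame, const vm_op *op);

/* Receiving a parameter is an ordinary assignment into a local slot. */
static void vm_handle_recv(vm_frame *frame, const vm_op *op)
{
    frame->locals[op->result] = frame->args[op->op1];
}

static void vm_handle_add(vm_frame *frame, const vm_op *op)
{
    frame->locals[op->result] = frame->locals[op->op1] + frame->locals[op->op2];
}

static void vm_handle_return(vm_frame *frame, const vm_op *op)
{
    frame->retval = frame->locals[op->op1];
    frame->done = 1;
}

/* Handler table indexed by opcode: one C function per instruction kind. */
static const vm_handler vm_handlers[] = {
    vm_handle_recv, vm_handle_add, vm_handle_return
};

/* Run the instructions in sequence until a return is executed. */
long vm_execute(const vm_op *ops, size_t num_ops, const long *args)
{
    vm_frame frame = { {0}, NULL, 0, 0 };
    size_t i;
    assert(ops != NULL);
    frame.args = args;
    for (i = 0; i < num_ops && !frame.done; i++) {
        vm_handlers[ops[i].opcode](&frame, &ops[i]);
    }
    return frame.retval;
}

/* Instructions for a two-parameter function returning the sum:
   two receive instructions, one per parameter, then add and return. */
const vm_op vm_add_ops[] = {
    { VM_RECV,   0, 0, 0 },
    { VM_RECV,   1, 0, 1 },
    { VM_ADD,    0, 1, 2 },
    { VM_RETURN, 2, 0, 0 },
};
```

Even this two-parameter addition takes four indirect calls, whereas a built-in function doing the same work would be a single C call.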
Class methods are executed in the same way as user functions: they are also translated into opcodes and executed in sequence. Zend implements classes with a data structure called zend_class_entry, which stores the basic information related to the class. This entry is built when the PHP code is compiled.
In the common member of zend_function there is a field called scope, which points to the zend_class_entry of the class that the current method belongs to. The object-oriented implementation in PHP is not covered in more detail here; a separate article will describe its principles. As far as calling is concerned, a method is implemented in exactly the same way as a function, so in theory its performance is similar. A detailed performance comparison follows later.
count is a function we use often. It returns the length of an array.
What is the complexity of count? A common claim is that count traverses the entire array to find the number of elements, so its complexity is O(n). Is this actually the case?
Let’s look at the implementation of count. From the source code we can see that, for an array, the call path is zif_count -> php_count_recursive -> zend_hash_num_elements, and all zend_hash_num_elements does is return ht->nNumOfElements. This is an O(1) operation rather than an O(n) one. At the bottom layer, a PHP array is a hash table, and the hash table has a dedicated member, nNumOfElements, that records the current number of elements, so an ordinary count simply returns this value. From this we conclude that count has a complexity of O(1), independent of the size of the array.
How does count behave for a variable that is not an array? It returns 0 for unset variables and 1 for int, double, string, and similar values. Note that this describes the PHP 5 behavior analyzed here: since PHP 7.2, calling count on such a value raises a warning, and in PHP 8 it throws a TypeError.
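A minimal sketch of why the element count is O(1): the table updates a counter whenever an element is added, so reporting the size never walks the buckets. The demo_hash type and its functions are hypothetical and much simpler than Zend's HashTable; only the member name nNumOfElements is taken from the text.

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_BUCKETS 8

typedef struct demo_bucket {
    unsigned long key;
    long value;
    struct demo_bucket *next;   /* chain of colliding keys */
} demo_bucket;

typedef struct {
    demo_bucket *buckets[DEMO_BUCKETS];
    unsigned int nNumOfElements;   /* maintained on every insert */
} demo_hash;

void demo_hash_init(demo_hash *ht)
{
    int i;
    for (i = 0; i < DEMO_BUCKETS; i++) {
        ht->buckets[i] = NULL;
    }
    ht->nNumOfElements = 0;
}

/* Insert or overwrite; returns 1 on success, 0 if allocation fails.
   The counter only grows when a new key is added. */
int demo_hash_set(demo_hash *ht, unsigned long key, long value)
{
    demo_bucket **slot = &ht->buckets[key % DEMO_BUCKETS];
    demo_bucket *b;
    for (b = *slot; b != NULL; b = b->next) {
        if (b->key == key) {
            b->value = value;
            return 1;
        }
    }
    b = malloc(sizeof *b);
    if (b == NULL) {
        return 0;
    }
    b->key = key;
    b->value = value;
    b->next = *slot;
    *slot = b;
    ht->nNumOfElements++;
    return 1;
}

/* O(1): no traversal, just the stored counter. */
unsigned int demo_hash_num_elements(const demo_hash *ht)
{
    return ht->nNumOfElements;
}

void demo_hash_destroy(demo_hash *ht)
{
    int i;
    for (i = 0; i < DEMO_BUCKETS; i++) {
        demo_bucket *b = ht->buckets[i];
        while (b != NULL) {
            demo_bucket *next = b->next;
            free(b);
            b = next;
        }
        ht->buckets[i] = NULL;
    }
    ht->nNumOfElements = 0;
}
```

The cost of keeping the counter is one increment per insertion, which is why asking for the size is independent of how many elements are stored.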
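strlen returns the length of a string. So how is it implemented?
We all know that strlen is an O(n) function in C, which traverses the string sequentially until it encounters the terminating \0. A PHP string, by contrast, carries its length with it, so PHP can obtain the length directly without traversing the string.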
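The difference between scanning for a terminator and reading a stored length can be sketched in C as follows. The demo_string type and its functions are hypothetical and are not the actual Zend definition; they only illustrate a string that carries its own length.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical length-carrying string. */
typedef struct {
    const char *val;  /* the bytes, which may themselves contain '\0' */
    size_t len;       /* number of bytes, recorded when the string is made */
} demo_string;

demo_string demo_string_make(const char *bytes, size_t len)
{
    demo_string s;
    assert(bytes != NULL);
    s.val = bytes;
    s.len = len;
    return s;
}

/* O(1): returns the stored length without looking at the bytes. */
size_t demo_strlen(const demo_string *s)
{
    return s->len;
}

/* O(n): C's strlen scans byte by byte and stops at the first '\0'. */
size_t demo_c_strlen(const demo_string *s)
{
    return strlen(s->val);
}
```

For the five bytes "ab\0cd", demo_strlen reports 5 while demo_c_strlen reports 2, because the C routine stops at the embedded terminator.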
In addition, when strlen is called on a non-string variable, the variable is first converted to a string and then its length is taken. This needs to be noted.
isset and array_key_exists
array_push and array[]
rand and mt_rand
We all know that rand generates pseudo-random numbers. In C, you need to call srand to specify the seed explicitly. In PHP, however, rand calls srand once for you by default, so under normal circumstances there is no need to call it explicitly.
Note that if you do need to seed the generator yourself, you must call the matching seeding function: srand corresponds to rand, and mt_srand corresponds to mt_rand. They must not be mixed, otherwise the seed has no effect.
sort and usort
Both are implemented with a standard quicksort. If you need sorting, then unless there are special circumstances, simply call the functions PHP provides. There is no need to re-implement sorting yourself, and a hand-written PHP version will be much less efficient, for the reasons given in the earlier comparison of user functions and built-in functions.
urlencode and rawurlencode
Both of these are used for URL encoding. All non-alphanumeric characters in the string except -_. are replaced with a percent sign (%) followed by two hexadecimal digits. The only difference between the two is how they treat spaces: urlencode encodes a space as +, while rawurlencode encodes it as %20.
Generally, except for search engines, our strategy is to encode spaces as %20, so the latter is used more often. Note that each encode function must be paired with its matching decode function.
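The encoding rule described above fits in a few lines of C. This is a sketch of the rule as stated in the text, not PHP's implementation: demo_urlencode, its raw flag, and its return convention are hypothetical, and it assumes the default "C" locale so that isalnum only accepts ASCII letters and digits.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Encode `in` into `out` (capacity out_size, including the terminator).
   raw == 0: space becomes '+'    (the urlencode rule from the text)
   raw != 0: space becomes "%20"  (the rawurlencode rule from the text)
   Returns 0 on success, -1 if `out` is too small. */
int demo_urlencode(const char *in, char *out, size_t out_size, int raw)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t n = 0;
    assert(in != NULL && out != NULL && out_size > 0);
    for (; *in != '\0'; in++) {
        unsigned char c = (unsigned char)*in;
        int keep = isalnum(c) || strchr("-_.", c) != NULL;
        int plus = (c == ' ' && !raw);
        size_t need = (keep || plus) ? 1 : 3;
        if (n + need >= out_size) {   /* leave room for the terminator */
            out[n] = '\0';
            return -1;
        }
        if (keep) {
            out[n++] = (char)c;
        } else if (plus) {
            out[n++] = '+';
        } else {
            out[n++] = '%';
            out[n++] = hex[c >> 4];    /* high four bits */
            out[n++] = hex[c & 0x0F];  /* low four bits */
        }
    }
    out[n] = '\0';
    return 0;
}
```

With raw set to 0 the input "a b" becomes "a+b", and with raw set to 1 it becomes "a%20b"; every other character is treated identically by both modes.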
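The strcmp series
This series of functions includes strcmp, strncmp, strcasecmp, and strncasecmp, and they serve the same purpose as the C functions of the same names. There is a difference, however: since PHP strings are allowed to contain \0, the comparison cannot rely on a terminating \0 the way the C functions do.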
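A length-aware comparison can be sketched as follows. The function names are hypothetical and this is one plausible way to compare strings whose lengths are known, not a copy of the PHP source: the ordering function compares the common prefix with memcmp and lets the lengths break the tie, and the equality function checks the lengths before touching any bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Ordering comparison that does not stop at '\0'.
   Returns a negative, zero, or positive value like strcmp. */
int demo_binary_strcmp(const char *s1, size_t len1,
                       const char *s2, size_t len2)
{
    size_t min_len = len1 < len2 ? len1 : len2;
    int r;
    assert(s1 != NULL && s2 != NULL);
    r = memcmp(s1, s2, min_len);
    if (r != 0) {
        return r;
    }
    if (len1 == len2) {
        return 0;
    }
    return len1 < len2 ? -1 : 1;   /* the shorter string sorts first */
}

/* Equality test: different lengths are rejected without reading a byte. */
int demo_binary_equals(const char *s1, size_t len1,
                       const char *s2, size_t len2)
{
    assert(s1 != NULL && s2 != NULL);
    if (len1 != len2) {
        return 0;
    }
    return memcmp(s1, s2, len1) == 0;
}
```

C's strcmp treats the four-byte strings "ab\0x" and "ab\0y" as equal because it stops at the embedded \0, whereas demo_binary_strcmp compares all four bytes and orders the first before the second.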
In addition, since PHP can obtain the string length directly, it checks the length first, which in many cases makes the comparison much more efficient.
is_int and is_numeric
is_int: Determines whether a variable is of integer type. A PHP variable has a field that records its type, so the type can be checked directly. This is a strictly O(1) operation.
is_numeric: Determines whether a variable is a number or a numeric string. In other words, besides returning true for numeric variables, it also returns true for string variables of the form "1234", "1e4", and so on. In the string case, the string has to be traversed to make the judgment.
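The cost difference between the two checks can be sketched with a tagged value in C. The demo_value type and the functions below are hypothetical, and the numeric-string grammar is deliberately simplified (optional sign, digits, optional fraction, optional exponent, no surrounding whitespace), so it should not be read as PHP's exact rule.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

typedef enum { DEMO_LONG, DEMO_DOUBLE, DEMO_STRING } demo_type;

/* A value that records its own type, as PHP variables do. */
typedef struct {
    demo_type type;
    union {
        long lval;
        double dval;
        const char *str;
    } value;
} demo_value;

/* O(1): only the type field is inspected. */
int demo_is_int(const demo_value *v)
{
    return v->type == DEMO_LONG;
}

/* O(n): walks the string once, accepting forms such as "1234" or "1e4". */
static int demo_is_numeric_str(const char *s)
{
    int digits = 0;
    if (*s == '+' || *s == '-') {
        s++;
    }
    while (isdigit((unsigned char)*s)) { s++; digits++; }
    if (*s == '.') {
        s++;
        while (isdigit((unsigned char)*s)) { s++; digits++; }
    }
    if (digits == 0) {
        return 0;               /* no digits in the mantissa */
    }
    if (*s == 'e' || *s == 'E') {
        int exp_digits = 0;
        s++;
        if (*s == '+' || *s == '-') {
            s++;
        }
        while (isdigit((unsigned char)*s)) { s++; exp_digits++; }
        if (exp_digits == 0) {
            return 0;           /* "1e" has no exponent digits */
        }
    }
    return *s == '\0';          /* the whole string must be consumed */
}

/* Numbers pass on the type tag alone; strings need the scan above. */
int demo_is_numeric(const demo_value *v)
{
    assert(v != NULL);
    switch (v->type) {
    case DEMO_LONG:
    case DEMO_DOUBLE:
        return 1;
    case DEMO_STRING:
        return v->value.str != NULL && demo_is_numeric_str(v->value.str);
    }
    return 0;
}
```

A value tagged DEMO_STRING holding "1e4" fails demo_is_int immediately but passes demo_is_numeric after a full scan, which mirrors the cost difference described above.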