Home > Article > Backend Development > PHP execution principle
PHP Execution Principle
php is a language with very simple application and extremely high development efficiency. Its weakly typed variables can save programmers a lot of definition of variables, The time and effort of type conversion etc. It is a dynamic language suitable for web development.
Multi-process model: This can ensure that processes are not affected by each other, and the resource utilization of processes is faster and more convenient
Weakly typed language: Unlike strongly typed languages such as C, C++, and Java, the type of variables in PHP is not determined at the beginning. It is determined at runtime. , it can be type-converted implicitly or explicitly, which makes it very flexible in development. Programmers do not need to pay attention to variable types.
Zend Engine + Component ( ext) mode reduces internal coupling
The middle layer (sapi) isolates the web server and PHP
The syntax is simple and flexible, with few specifications. This has pros and cons. . .
php has a total of four-layer system from top to bottom:
Zend Engine: Zend is implemented entirely in C and is the core part of PHP. It translates PHP code into executable opcode, processes and implements corresponding processing methods (principle: Niao Ge's blog), implements basic data structures, memory allocation and management, and provides corresponding API methods for external use, which is the core of everything.
Extensions: Around the Zend engine, extensions provide various basic services through components. Commonly used built-in function arrays, standard libraries, etc. are all provided through extensions. Users can also implement their own extensions as needed to achieve functional expansion and other purposes. For example, the PHP middle layer and rich text parsing currently used by Tieba are typical applications of extensions).
Sapi: Sapi’s full name is Server Application Programing Interface, which is the server application programming interface. Sapi enables PHP to interact with peripheral data through a series of hook functions. This is a very elegant and successful design of PHP. Through sapi, PHP itself is successfully decoupled and isolated from the upper-layer application. PHP can no longer consider how to be compatible with different applications, and the application itself can also implement different processing according to its own characteristics. Way.
Upper-layer application: This is the application program written by programmers. Various application modes are obtained through different sapi methods, such as implementing web applications through webserver , run as a script on the command line, etc.
As mentioned before, Sapi enables external applications to interact with php through a series of interfaces Data can be exchanged and specific processing methods can be implemented according to different application characteristics. Common sapis are:
apache2handler: Using apache as the webserver, the processing method when running in MOD_PHP mode is also now The most widely used one
cgi: This is another way of interaction between webserver and php, which is the fastcgi protocol
cli: Command debugging application mode
As can be seen from the figure, PHP is implemented through the Zend engine A typical dynamic language execution process is described: a piece of code is obtained, and after lexical analysis, syntax analysis and other stages, the source program is translated into instructions (opcodes), and then the Zend virtual machine executes these instructions sequentially. PHP itself is implemented in C language, so the functions ultimately called are also C language functions.
The core of PHP execution is the translated instructions one by one, that is, opcode
Opcode is the most basic unit of PHP program execution. An opcode consists of two parameters (op1, op2), return value and processing function. The PHP program is ultimately translated into the sequential execution of a set of opcode processing functions.
Several commonly used functions:
END_ASSIGN_SPEC_CV_CV_HANDLER: variable allocation (a=b)
Zend hash table implements the typical hash table hash structure, and at the same time provides the functions of forward, reverse, and array traversal by attaching a doubly linked list. The structure is as follows:
As you can see, the hash table has both a hash structure in the form of key->value and a doubly linked list mode, making it very convenient to support fast search and linear traversal.
Hash structure: Zend’s hash structure is a typical hash table model, which resolves conflicts through a linked list. It should be noted that zend's hash table is a self-growing data structure. When the number of hash tables is full, it will dynamically expand by 2 times and reposition elements. The initial size is 8. In addition, when performing key->value fast search, zend itself has also made some optimizations to speed up the process by exchanging space for time. For example, a variable nKeyLength is used in each element to identify the length of the key for quick determination.
Doubly linked list: Zend hash table implements linear traversal of elements through a linked list structure. Theoretically, it is enough to use a one-way linked list for traversal. The main purpose of using a doubly linked list is to quickly delete and avoid traversal. Zend hash table is a composite structure. When used as an array, it supports common associative arrays and can also be used as sequential index numbers, and even allows a mixture of the two.
PHP Associative Array: Associative array is a typical hash_table application. A query process goes through the following steps (as can be seen from the code, this is a common hash query process and some quick judgments are added to speed up the search):
01 getKeyHashValue h; 02 index = n & nTableMask; 03 Bucket *p = arBucket[index]; 04 while (p) { 05 if ((p->h == h) && (p->nKeyLength == nKeyLength)) { 06 RETURN p->data; 07 } 08 p=p->next; 09 } 10 RETURN FALTURE;
PHP index array: Index array It is our common array, accessed through subscripts. For example, arr[0], Zend HashTable has been normalized internally, and the hash value and nKeyLength(is 0) are also assigned to the index type key. The internal member variable nNextFreeElement is the currently assigned maximum id, which is automatically increased by one after each push. It is this normalization process that allows PHP to achieve a mixture of associative and non-associative data. Due to the particularity of the push operation, the order of the index keys in the PHP array is not determined by the size of the subscript, but by the order of the push. For example, arr[1] = 2; arr[2] = 3; for double type key, Zend HashTable will treat it as an index key
5.2 Implementation principle of PHP variables
PHP is a weakly typed language and does not strictly distinguish the types of variables. PHP variables can be divided into simple types (int, sting, bool), collection types (array, resource, object) and constants (const). All variables have the same structure at the bottom zval
Zval is a very important data structure in zend. It is used to mark and implement PHP variables. Its data structure is as follows:
struct _zval_struct { zvalue_value value; /* value */ zend_uint refcount__gc; /* variable ref count */ zend_uchar type; /* active type */ zend_uchar is_ref__gc; /* if it is a ref variable */ }; typedef struct _zval_struct zval;
Among them,
zval_value value is the actual value of the variable, specifically a zvalue_value union:
typedef union _zvalue_value { long lval; /* long value */ double dval; /* double value */ struct { /* string */ char *val; int len; } str; HashTable *ht; /* hash table value,used for array */ zend_object_value obj; /* object */ } zvalue_value;
zend_uint refcount__gc is A counter used to save how many variables (or symbols, symbols) point to this zval. When a variable is generated, its refcount=1. Typical assignment operations such as $a = $b will increase the refcount of zval by 1, and the unset operation will decrease it by 1 accordingly. Before PHP5.3, the reference counting mechanism was used to implement GC. If the refcount of a zval was less than 0, then the Zend engine would think that there was no variable pointing to the zval, so it would release the memory space occupied by the zval. But, sometimes things are not that simple. We will see later that the simple reference counting mechanism cannot GC the circularly referenced zval, even if the variable pointing to the zval has been unset, resulting in a memory leak (Memory Leak).
zend_uchar typeThis field is used to indicate the actual type of the variable. Variables in PHP include four scalar types (bool, int, float, string), two composite types (array, object) and two special types (resource and NULL). Inside zend, these types correspond to the following macro (code location phpsrc/Zend/zend.h)
#define IS_NULL 0 #define IS_LONG 1 #define IS_DOUBLE 2 #define IS_BOOL 3 #define IS_ARRAY 4 #define IS_OBJECT 5 #define IS_STRING 6 #define IS_RESOURCE 7 #define IS_CONSTANT 8 #define IS_CONSTANT_ARRAY 9 #define IS_CALLABLE 10
is_ref__gcThis field Used to mark whether a variable is a reference variable. For ordinary variables, the value is 0, and for reference variables, the value is 1. This variable will affect the sharing, separation, etc. of zval
Integer and floating-point number are one of the basic types in PHP and are also simple variables. For integers and floating point numbers, the corresponding values are stored directly in zvalue. Its types are long and double respectively.
As can be seen from the zvalue structure, for integer types, unlike strongly typed languages such as c, PHP does not distinguish between int, unsigned int, long, long long and other types. For it, integers only One type is long. From this, it can be seen that in PHP, the value range of integers is determined by the number of compiler bits and is not fixed. What happens if an integer goes out of bounds in php? PHP will automatically convert integers into floating-point number types
For floating-point numbers, similar to integers, it does not distinguish between float and double, but only unified one type: double
Like integers, character variables are also basic types and simple variables in PHP. It can be seen from the zvalue structure that in PHP, a string is composed of a pointer to the actual data and a length structure, which is similar to the string in C++. Since the length is represented by an actual variable, unlike c, its string can be binary data (including 0). At the same time, in PHP, finding the string length strlen is an O(1) operation
Common string splicing methods and speed comparison:
Assume there are the following 4 variables: strA='123'; strB = '456'; intA=123; intB=456;
Now make a comparison and explanation of the following string splicing methods:
1 res = strA.strB and res = "strAstrB"
In this case, zend will re-malloc a piece of memory and process it accordingly. , its speed is average.
2 strA = strA.strB
This is the fastest, zend will directly relloc based on the current strA to avoid repeated copies
3 res = intA.intB
This is slower , because implicit format conversion is required, you should also pay attention to avoid it when writing programs
4 strA = sprintf (“%s%s”, strA, strB);
This will be the slowest one method, because sprintf is not a language structure in PHP, and it takes a lot of time to identify and process the format. In addition, the mechanism itself is malloc. However, the sprintf method is the most readable, and in practice it can be chosen flexibly according to specific circumstances.
PHP arrays are naturally implemented through Zend Hash Table.
How to implement the foreach operation? Foreach of an array is completed by traversing the doubly linked list in the hashtable. For index arrays, traversal through foreach is much more efficient than for, eliminating the need to search for key->value. The count operation directly calls HashTable->NumOfElements, O(1) operation. For a string like '123', zend will convert it to its integer form. arr[‘123’] and arr[123]
Reference counting is widely used in memory recycling, string operations, etc. Zval's reference counting is implemented through the member variables is_ref and ref_count. Through reference counting, multiple variables can share the same data. Avoid the huge consumption caused by frequent copying. During the assignment operation, zend points the variable to the same zval and ref_count++, and during the unset operation, the corresponding ref_count-1. The destruction operation will only be performed when ref_count is reduced to 0. If it is a reference assignment, zend will modify is_ref to 1.
PHP variables realize variable sharing data through reference counting. What if you change the value of one of the variables? When trying to write a variable, if Zend finds that the zval pointed to by the variable is shared by multiple variables, it will copy a zval with a ref_count of 1 and decrement the refcount of the original zval. This process is called "zval separation". It can be seen that zend only performs copy operations when a write operation occurs, so it is also called copy-on-write (copy on write)
For reference variables, the requirements are the same as non-reference On the contrary, variables assigned by reference must be bundled. Modifying one variable modifies all bundled variables.
How are local variables and global variables implemented in PHP? For a request, PHP can see two symbol tables (symbol_table and active_symbol_table) at any time, of which the former is used to maintain global variables. The latter is a pointer pointing to the currently active variable symbol table. When the program enters a function, zend will allocate a symbol table x to it and point active_symbol_table to a. In this way, the distinction between global and local variables is achieved.
Get variable values: PHP's symbol table is implemented through hash_table. Each variable is assigned a unique identifier. When obtaining, the corresponding zval is found from the table according to the identifier and returned.
Using global variables in functions: In functions, we can use global variables by explicitly declaring global. Create a reference to the variable with the same name in symbol_table in active_symbol_table (if the value of the reference variable needs to be updated, everyone will update it together). If there is no variable with the same name in symbol_table, it will be created first.
Reference:
http://www.php.cn/
http://www.php.cn/
[http://www.php.cn/
PHP Execution Principle
php is a language with very simple application and extremely high development efficiency. Its weakly typed variables can save programmers a lot of definition of variables, The time and effort of type conversion etc. It is a dynamic language suitable for web development.
Multi-process model: This can ensure that processes are not affected by each other, and the resource utilization of processes is faster and more convenient
Weakly typed language: Unlike strongly typed languages such as C, C++, and Java, the type of variables in PHP is not determined at the beginning. It is determined at runtime. , it can be type-converted implicitly or explicitly, which makes it very flexible in development. Programmers do not need to pay attention to variable types.
Zend Engine + Component ( ext) mode reduces internal coupling
The middle layer (sapi) isolates the web server and PHP
The syntax is simple and flexible, with few specifications. This has pros and cons. . .
php has a total of four-layer system from top to bottom:
Zend Engine: Zend is implemented entirely in C and is the core part of PHP. It translates PHP code into executable opcode, processes and implements corresponding processing methods (principle: Niao Ge's blog), implements basic data structures, memory allocation and management, and provides corresponding API methods for external use, which is the core of everything.
Extensions: Around the Zend engine, extensions provide various basic services through components. Commonly used built-in function arrays, standard libraries, etc. are all provided through extensions. Users can also implement their own extensions as needed to achieve functional expansion and other purposes. For example, the PHP middle layer and rich text parsing currently used by Tieba are typical applications of extensions).
Sapi: Sapi’s full name is Server Application Programing Interface, which is the server application programming interface. Sapi enables PHP to interact with peripheral data through a series of hook functions. This is a very elegant and successful design of PHP. Through sapi, PHP itself is successfully decoupled and isolated from the upper-layer application. PHP can no longer consider how to be compatible with different applications, and the application itself can also implement different processing according to its own characteristics. Way.
Upper-layer application: This is the application program written by programmers. Various application modes are obtained through different sapi methods, such as implementing web applications through webserver , run as a script on the command line, etc.
As mentioned before, Sapi enables external applications to interact with php through a series of interfaces Data can be exchanged and specific processing methods can be implemented according to different application characteristics. Common sapis are:
apache2handler: Using apache as the webserver, the processing method when running in MOD_PHP mode is also now The most widely used one
cgi: This is another way of interaction between webserver and php, which is the fastcgi protocol
cli: Command debugging application mode
As can be seen from the figure, PHP is implemented through the Zend engine A typical dynamic language execution process is described: a piece of code is obtained, and after lexical analysis, syntax analysis and other stages, the source program is translated into instructions (opcodes), and then the Zend virtual machine executes these instructions sequentially. PHP itself is implemented in C language, so the functions ultimately called are also C language functions.
The core of PHP execution is the translated instructions one by one, that is, opcode
Opcode is the most basic unit of PHP program execution. An opcode consists of two parameters (op1, op2), return value and processing function. The PHP program is ultimately translated into the sequential execution of a set of opcode processing functions.
Several commonly used functions:
END_ASSIGN_SPEC_CV_CV_HANDLER: variable allocation (a=b)
Zend hash table implements the typical hash table hash structure, and at the same time provides the functions of forward, reverse, and array traversal by attaching a doubly linked list. The structure is as follows:
As you can see, the hash table has both a hash structure in the form of key->value and a doubly linked list mode, making it very convenient to support fast search and linear traversal.
Hash structure: Zend’s hash structure is a typical hash table model, which resolves conflicts through a linked list. It should be noted that zend's hash table is a self-growing data structure. When the number of hash tables is full, it will dynamically expand by 2 times and reposition elements. The initial size is 8. In addition, when performing key->value fast search, zend itself has also made some optimizations to speed up the process by exchanging space for time. For example, a variable nKeyLength is used in each element to identify the length of the key for quick determination.
Doubly linked list: Zend hash table implements linear traversal of elements through a linked list structure. Theoretically, it is enough to use a one-way linked list for traversal. The main purpose of using a doubly linked list is to quickly delete and avoid traversal. Zend hash table is a composite structure. When used as an array, it supports common associative arrays and can also be used as sequential index numbers, and even allows a mixture of the two.
PHP Associative Array: Associative array is a typical hash_table application. A query process goes through the following steps (as can be seen from the code, this is a common hash query process and some quick judgments are added to speed up the search):
01 getKeyHashValue h; 02 index = n & nTableMask; 03 Bucket *p = arBucket[index]; 04 while (p) { 05 if ((p->h == h) && (p->nKeyLength == nKeyLength)) { 06 RETURN p->data; 07 } 08 p=p->next; 09 } 10 RETURN FALTURE;
PHP index array: Index array It is our common array, accessed through subscripts. For example, arr[0], Zend HashTable has been normalized internally, and the hash value and nKeyLength(is 0) are also assigned to the index type key. The internal member variable nNextFreeElement is the currently assigned maximum id, which is automatically increased by one after each push. It is this normalization process that allows PHP to achieve a mixture of associative and non-associative data. Due to the particularity of the push operation, the order of the index keys in the PHP array is not determined by the size of the subscript, but by the order of the push. For example, arr[1] = 2; arr[2] = 3; for double type key, Zend HashTable will treat it as an index key
5.2 Implementation principle of PHP variables
PHP is a weakly typed language and does not strictly distinguish the types of variables. PHP variables can be divided into simple types (int, sting, bool), collection types (array, resource, object) and constants (const). All variables have the same structure at the bottom zval
Zval is a very important data structure in zend. It is used to mark and implement PHP variables. Its data structure is as follows:
struct _zval_struct { zvalue_value value; /* value */ zend_uint refcount__gc; /* variable ref count */ zend_uchar type; /* active type */ zend_uchar is_ref__gc; /* if it is a ref variable */ }; typedef struct _zval_struct zval;
Among them,
zval_value value is the actual value of the variable, specifically a zvalue_value union:
typedef union _zvalue_value { long lval; /* long value */ double dval; /* double value */ struct { /* string */ char *val; int len; } str; HashTable *ht; /* hash table value,used for array */ zend_object_value obj; /* object */ } zvalue_value;
zend_uint refcount__gc is A counter used to save how many variables (or symbols, symbols) point to this zval. When a variable is generated, its refcount=1. Typical assignment operations such as $a = $b will increase the refcount of zval by 1, and the unset operation will decrease it by 1 accordingly. Before PHP5.3, the reference counting mechanism was used to implement GC. If the refcount of a zval was less than 0, then the Zend engine would think that there was no variable pointing to the zval, so it would release the memory space occupied by the zval. But, sometimes things are not that simple. We will see later that the simple reference counting mechanism cannot GC the circularly referenced zval, even if the variable pointing to the zval has been unset, resulting in a memory leak (Memory Leak).
zend_uchar typeThis field is used to indicate the actual type of the variable. Variables in PHP include four scalar types (bool, int, float, string), two composite types (array, object) and two special types (resource and NULL). Inside zend, these types correspond to the following macro (code location phpsrc/Zend/zend.h)
#define IS_NULL 0 #define IS_LONG 1 #define IS_DOUBLE 2 #define IS_BOOL 3 #define IS_ARRAY 4 #define IS_OBJECT 5 #define IS_STRING 6 #define IS_RESOURCE 7 #define IS_CONSTANT 8 #define IS_CONSTANT_ARRAY 9 #define IS_CALLABLE 10
is_ref__gcThis field Used to mark whether a variable is a reference variable. For ordinary variables, the value is 0, and for reference variables, the value is 1. This variable will affect the sharing, separation, etc. of zval
Integer and floating-point number are one of the basic types in PHP and are also simple variables. For integers and floating point numbers, the corresponding values are stored directly in zvalue. Their types are long and double respectively.
As can be seen from the zvalue structure, for integer types, unlike strongly typed languages such as c, PHP does not distinguish between int, unsigned int, long, long long and other types. For it, integers only One type is long. From this, it can be seen that in PHP, the value range of integers is determined by the number of compiler bits and is not fixed. What happens if an integer goes out of bounds in php? PHP will automatically convert integers into floating-point number types
For floating-point numbers, similar to integers, it does not distinguish between float and double, but only unified one type: double
Like integers, character variables are also basic types and simple variables in PHP. It can be seen from the zvalue structure that in PHP, a string is composed of a pointer to the actual data and a length structure, which is similar to the string in C++. Since the length is represented by an actual variable, unlike c, its string can be binary data (including 0). At the same time, in PHP, finding the string length strlen is an O(1) operation
Common string splicing methods and speed comparison:
Assume there are the following 4 variables: strA='123'; strB = '456'; intA=123; intB=456;
Now make a comparison and explanation of the following string splicing methods:
1 res = strA.strB and res = "strAstrB"
In this case, zend will re-malloc a piece of memory and process it accordingly. , its speed is average.
2 strA = strA.strB
This is the fastest, zend will directly relloc based on the current strA to avoid repeated copies
3 res = intA.intB
This is slower , because implicit format conversion is required, you should also pay attention to avoid it when writing programs
4 strA = sprintf (“%s%s”, strA, strB);
This will be the slowest one method, because sprintf is not a language structure in PHP, and it takes a lot of time to identify and process the format. In addition, the mechanism itself is malloc. However, the sprintf method is the most readable, and in practice it can be chosen flexibly according to specific circumstances.
PHP arrays are naturally implemented through Zend Hash Table.
How to implement the foreach operation? Foreach of an array is completed by traversing the doubly linked list in the hashtable. For index arrays, traversal through foreach is much more efficient than for, eliminating the need to search for key->value. The count operation directly calls HashTable->NumOfElements, O(1) operation. For a string like '123', zend will convert it to its integer form. arr[‘123’] and arr[123]
Reference counting is widely used in memory recycling, string operations, etc. Zval's reference counting is implemented through the member variables is_ref and ref_count. Through reference counting, multiple variables can share the same data. Avoid the huge consumption caused by frequent copying. During the assignment operation, zend points the variable to the same zval and ref_count++, and during the unset operation, the corresponding ref_count-1. The destruction operation will only be performed when ref_count is reduced to 0. If it is a reference assignment, zend will modify is_ref to 1.
PHP variables realize variable sharing data through reference counting. What if you change the value of one of the variables? When trying to write a variable, if Zend finds that the zval pointed to by the variable is shared by multiple variables, it will copy a zval with a ref_count of 1 and decrement the refcount of the original zval. This process is called "zval separation". It can be seen that zend only performs copy operations when a write operation occurs, so it is also called copy-on-write (copy on write)
For reference variables, the requirements are the same as non-reference On the contrary, variables assigned by reference must be bundled. Modifying one variable modifies all bundled variables.
How are local variables and global variables implemented in PHP? For a request, PHP can see two symbol tables (symbol_table and active_symbol_table) at any time, of which the former is used to maintain global variables. The latter is a pointer pointing to the currently active variable symbol table. When the program enters a function, zend will allocate a symbol table x to it and point active_symbol_table to a. In this way, the distinction between global and local variables is achieved.
Get variable values: PHP's symbol table is implemented through hash_table. Each variable is assigned a unique identifier. When obtaining, the corresponding zval is found from the table according to the identifier and returned.
Using global variables in functions: In functions, we can use global variables by explicitly declaring global. Create a reference to the variable with the same name in symbol_table in active_symbol_table (if the value of the reference variable needs to be updated, everyone will update it together). If there is no variable with the same name in symbol_table, it will be created first.
For more PHP execution principles and related articles, please pay attention to the PHP Chinese website!