
Analyzing memory management in PHP: how PHP dynamically allocates and releases memory

WBOY | Original | 2016-07-21 15:02:15 | 944 views

Abstract: Memory management has a very important impact on long-running programs, such as server daemons; therefore, understanding how PHP allocates and releases memory is extremely important for creating such programs. This article focuses on how PHP manages memory.

1. Memory
In PHP, populating a string variable takes a single statement, <?php $str = 'hello world '; ?>, and the string can then be freely modified, copied, and moved. In C, although you can write a simple static string such as char *str = "hello world ";, you cannot modify it, because it lives in program space. To create a string you can manipulate, you must allocate a block of memory and copy the contents into it with a function such as strdup().

    {
        char *str;
        /* strdup() allocates a new block and copies the literal into it */
        str = strdup("hello world ");
        if (!str) {
            fprintf(stderr, "Unable to allocate memory!");
        }
    }

For reasons we will analyze later, the traditional memory management functions (such as malloc(), free(), strdup(), realloc(), calloc(), etc.) are almost never used directly in the PHP source code.

2. Release memory
On almost all platforms, memory management is implemented through a request and release model. First, an application asks the layer below it (usually the "operating system"): "I want to use some memory space". If there is free space, the operating system provides it to the program and marks it so that it will not be allocated to other programs.

When the application has finished using this memory, it should return it to the OS so that it can be allocated to other programs. If the program does not return the memory, the OS has no way of knowing that it is no longer in use and could be given to another process. If a block of memory is not freed and the owning application loses track of it, the application is said to have "leaked" it, because the memory is simply no longer available to other programs.

In a typical client application, small, infrequent memory leaks can sometimes be "tolerated" because the leaked memory is implicitly returned to the OS when the process later terminates. This works because the OS knows which program it allocated the memory to, and it can be sure that the memory is no longer needed once the program terminates.

Long-running server daemons, including web servers like Apache and, by extension, its PHP module, are designed to keep running for a long time. Because the OS cannot reclaim memory until the process exits, any leak, no matter how small, accumulates through repeated operations and eventually exhausts all system resources.

Now consider the user space stristr() function. To find a string with a case-insensitive search, it actually creates a lowercase copy of each of the two strings and then performs a more traditional case-sensitive search to find the relative offset. Once the offset has been located, it has no further use for the lowercase copies. If it did not free them, every script that uses stristr() would leak some memory each time it is called. Eventually, the web server process would own all of the system memory without being able to use any of it.
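The pattern described above can be sketched in plain C. This is only an illustration, not the actual implementation of stristr(): the names lower_copy() and find_nocase() are hypothetical, and malloc()/free() stand in for whatever allocator the engine really uses.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly allocated lowercase copy of s,
 * or NULL if memory could not be allocated. */
static char *lower_copy(const char *s)
{
    size_t i, len;
    char *copy;

    assert(s != NULL);
    len = strlen(s);
    copy = malloc(len + 1);
    if (!copy) {
        return NULL;
    }
    for (i = 0; i < len; i++) {
        copy[i] = (char)tolower((unsigned char)s[i]);
    }
    copy[len] = '\0';
    return copy;
}

/* Hypothetical case-insensitive search: returns the offset of needle
 * in haystack, or -1 if it is absent or an allocation failed. */
long find_nocase(const char *haystack, const char *needle)
{
    long offset = -1;
    char *h = lower_copy(haystack);
    char *n = lower_copy(needle);

    if (h && n) {
        const char *hit = strstr(h, n);
        if (hit) {
            offset = (long)(hit - h);
        }
    }
    /* The lowercase copies are scratch space only. They must be
     * released on every path, or each call leaks a little memory. */
    free(h);
    free(n);
    return offset;
}
```

Calling find_nocase("Hello World", "WORLD") returns the offset of the match; dropping the two free() calls would not change that result, which is exactly why this kind of leak is easy to miss.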

You can safely say that the ideal solution is to write good, clean, consistent code. This is certainly true; however, in an environment like the PHP interpreter, this view is only half true.

3. Error handling
User space scripts, and the extension functions they rely on, need a way to bail out of an active request entirely. The Zend Engine implements this by setting a bailout address at the beginning of a request and then, on any die() or exit() call or on any critical error (E_ERROR), performing a longjmp() to that bailout address.

Although this bailout process simplifies program flow, in most cases it means that resource cleanup code (such as free() calls) is skipped, and memory leaks as a result. Now consider the following simplified version of the engine code that handles function calls:

    void call_function(const char *fname, int fname_len TSRMLS_DC)
    {
        zend_function *fe;
        char *lcase_fname;

        /* PHP function names are case-insensitive.
         * To simplify locating them in the function table,
         * all function names are implicitly translated to lowercase.
         */
        lcase_fname = estrndup(fname, fname_len);
        zend_str_tolower(lcase_fname, fname_len);

        if (zend_hash_find(EG(function_table), lcase_fname, fname_len + 1, (void **)&fe) == SUCCESS) {
            zend_execute(fe->op_array TSRMLS_CC);
        } else {
            php_error_docref(NULL TSRMLS_CC, E_ERROR, "Call to undefined function: %s()", fname);
        }
        efree(lcase_fname);
    }

When the php_error_docref() line is executed, the internal error handler sees that the error level is critical and calls longjmp() to interrupt the current program flow, leaving call_function() without ever reaching the efree(lcase_fname) line. You might want to move the efree() line above the php_error_docref() line; but what about the code that called call_function() in the first place? fname itself is most likely an allocated string, and you cannot free it until the error message has finished using it.
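The effect can be reproduced outside the engine with the standard setjmp()/longjmp() pair. The following is a minimal sketch, not Zend Engine code: tracked_strdup(), tracked_free(), call_function_sketch() and run_request() are hypothetical names, and a simple counter stands in for a real leak detector.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdlib.h>
#include <string.h>

static jmp_buf bailout;       /* stand-in for the engine's bailout address */
static int live_blocks = 0;   /* allocations that have not been freed yet */

static char *tracked_strdup(const char *s)
{
    char *copy = malloc(strlen(s) + 1);
    assert(copy != NULL);
    strcpy(copy, s);
    live_blocks++;
    return copy;
}

static void tracked_free(char *p)
{
    free(p);
    live_blocks--;
}

/* Mimics the shape of call_function(): copy the name, then either
 * finish normally or simulate a fatal error by jumping away. */
static void call_function_sketch(const char *fname, int fatal)
{
    char *lcase_fname = tracked_strdup(fname);

    if (fatal) {
        longjmp(bailout, 1);  /* the cleanup below never runs */
    }
    tracked_free(lcase_fname);
}

/* Runs one simulated request and reports how many blocks it leaked. */
int run_request(int fatal)
{
    int before = live_blocks;

    if (setjmp(bailout) == 0) {
        call_function_sketch("Strlen", fatal);
    }
    return live_blocks - before;
}
```

A normal call leaves the counter unchanged, while a call that bails out leaves one block behind, because control never returns to the line that frees it.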

Note that php_error_docref() is the internals equivalent of trigger_error(). Its first parameter is an optional documentation reference that is appended to docref_root if that setting is enabled in php.ini. The second is always TSRMLS_CC. The third parameter can be any of the familiar E_* family of constants indicating the severity of the error. The fourth and later parameters follow printf()-style formatting with a variable argument list.

4. Zend Memory Manager
One way to solve the memory leaks caused by bailing out of a request is the Zend Memory Manager (ZendMM) layer. This part of the engine behaves much like the operating system's memory management: it allocates memory to calling code. The difference is that it sits low enough in the process space to be "request aware", so that when a request ends it can do what the OS does when a process terminates. That is, it implicitly frees all memory owned by that request. Figure 1 shows the relationship between ZendMM, the OS, and the PHP process.


Figure 1. The Zend memory manager replaces system calls to implement memory allocation for each request.

In addition to implicit cleanup, ZendMM also caps the memory used by each request according to the memory_limit setting in php.ini. If a script tries to request more memory than is available on the system, or more than it is allowed to hold at one time, ZendMM automatically raises an E_ERROR and begins the bailout process. A further advantage of this approach is that the return value of most memory allocation calls does not need to be checked, since failure results in an immediate jump to the engine's bailout code.

Hooking PHP's internal code to the OS's actual memory management is not complicated: all memory allocated internally is requested through a specific set of alternative functions. For example, rather than allocating a 16-byte block with malloc(16), PHP code uses emalloc(16). Besides performing the actual allocation, ZendMM tags the block with information about the request it is bound to, so that when a request bails out, ZendMM can implicitly free it.
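To make the idea concrete, here is a small request-scoped allocator in plain C. It is a sketch under stated assumptions and not how ZendMM is actually implemented: req_malloc(), req_free(), req_shutdown() and req_usage() are hypothetical names, the fixed request_limit stands in for memory_limit, and a NULL return stands in for the E_ERROR bailout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Header stored in front of every block handed to the request.
 * The alignment keeps the user pointer suitably aligned for any type. */
typedef struct block {
    _Alignas(max_align_t) struct block *prev;
    struct block *next;
    size_t size;
} block;

static block *request_blocks = NULL;  /* blocks owned by the current request */
static size_t request_usage = 0;      /* bytes currently handed out */
static size_t request_limit = 1024;   /* stand-in for memory_limit */

/* Allocate size bytes for the current request, or return NULL when the
 * limit would be exceeded (the real manager bails out instead). */
void *req_malloc(size_t size)
{
    block *b;

    if (size > request_limit - request_usage) {
        return NULL;
    }
    b = malloc(sizeof(block) + size);
    if (!b) {
        return NULL;
    }
    b->size = size;
    b->prev = NULL;
    b->next = request_blocks;
    if (request_blocks) {
        request_blocks->prev = b;
    }
    request_blocks = b;
    request_usage += size;
    return b + 1;                     /* user memory starts after the header */
}

/* Explicitly free one block and unlink it from the request's list. */
void req_free(void *ptr)
{
    block *b = (block *)ptr - 1;

    assert(request_usage >= b->size);
    if (b->prev) {
        b->prev->next = b->next;
    } else {
        request_blocks = b->next;
    }
    if (b->next) {
        b->next->prev = b->prev;
    }
    request_usage -= b->size;
    free(b);
}

/* End of request: release everything the request never freed.
 * Returns the number of blocks that had to be cleaned up. */
int req_shutdown(void)
{
    int released = 0;

    while (request_blocks) {
        req_free(request_blocks + 1);
        released++;
    }
    return released;
}

size_t req_usage(void)
{
    return request_usage;
}
```

Because every block is linked into a per-request list, code that bails out before reaching its cleanup no longer leaks: req_shutdown() finds the forgotten blocks and frees them.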

Often, memory needs to be allocated for longer than the duration of a single request. Such allocations, called persistent allocations because they persist beyond the end of a request, can be made with the traditional memory allocators, because those do not add the extra per-request information that ZendMM uses. Sometimes, however, it is not known until runtime whether a particular allocation needs to be persistent, so ZendMM exports a set of helper macros that behave like the other allocator functions but take an extra final parameter indicating persistence.

If you genuinely want a persistent allocation, set this parameter to 1; the request is then passed through to the traditional malloc() family of allocators. If runtime logic determines that the block does not need to persist, set the parameter to 0, and the call is routed to the per-request allocator functions.

For example, pemalloc(buffer_len, 1) maps to malloc(buffer_len), and pemalloc(buffer_len, 0) maps to emalloc(buffer_len), through the following #define in Zend/zend_alloc.h:

    #define pemalloc(size, persistent) ((persistent)?malloc(size): emalloc(size))

Each of the allocator functions provided by ZendMM has a more traditional counterpart. Table 1 lists the traditional allocators alongside their e/pe equivalents.

Table 1. Traditional versus PHP-specific allocators.

Traditional allocator	e/pe counterpart
void *malloc(size_t count);	void *emalloc(size_t count); void *pemalloc(size_t count, char persistent);
void *calloc(size_t nmemb, size_t size);	void *ecalloc(size_t nmemb, size_t size); void *pecalloc(size_t nmemb, size_t size, char persistent);
void *realloc(void *ptr, size_t count);	void *erealloc(void *ptr, size_t count); void *perealloc(void *ptr, size_t count, char persistent);
char *strdup(const char *ptr);	char *estrdup(const char *ptr); char *pestrdup(const char *ptr, char persistent);
void free(void *ptr);	void efree(void *ptr); void pefree(void *ptr, char persistent);

You may notice that even pefree() requires the persistent flag. This is because at the point pefree() is called, it does not actually know whether ptr was a persistent allocation. Calling free() on a non-persistent allocation can lead to a messy double free, while calling efree() on a persistent one is likely to cause a segmentation fault as the memory manager tries to look up management information that does not exist. Your code is therefore expected to remember whether the data structure it allocated was persistent.
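One way to "remember" is to store the flag next to the pointer. The sketch below assumes nothing from the Zend headers: my_emalloc() and my_efree() are stand-ins built on malloc() with a counter, my_pemalloc() and my_pefree() copy the shape of the macro quoted earlier, and the buffer type with its helpers is purely hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

static int request_allocs = 0;  /* live blocks from the per-request allocator */

/* Stand-ins for emalloc()/efree() so the example builds on its own. */
static void *my_emalloc(size_t size)
{
    void *p = malloc(size);
    if (p) {
        request_allocs++;
    }
    return p;
}

static void my_efree(void *ptr)
{
    assert(request_allocs > 0);
    request_allocs--;
    free(ptr);
}

/* Same shape as the pemalloc() macro: choose the allocator at runtime. */
#define my_pemalloc(size, persistent) \
    ((persistent) ? malloc(size) : my_emalloc(size))
#define my_pefree(ptr, persistent) \
    ((persistent) ? free(ptr) : my_efree(ptr))

/* The owner records which allocator produced the block, so the
 * matching deallocator can be chosen later. */
typedef struct {
    char *data;
    size_t len;
    int persistent;
} buffer;

/* Returns 0 on success, -1 if the allocation failed. */
int buffer_init(buffer *buf, size_t len, int persistent)
{
    buf->data = my_pemalloc(len, persistent);
    if (!buf->data) {
        return -1;
    }
    buf->len = len;
    buf->persistent = persistent;
    return 0;
}

void buffer_destroy(buffer *buf)
{
    my_pefree(buf->data, buf->persistent);
    buf->data = NULL;
    buf->len = 0;
}

int live_request_allocs(void)
{
    return request_allocs;
}
```

With the flag kept inside the structure, callers of buffer_destroy() never have to guess which allocator was used, which avoids the mismatched free()/efree() calls described above.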
In addition to the core set of allocator functions, ZendMM provides a few other convenient functions of its own, such as:

    void *estrndup(void *ptr, int len);

This function allocates len+1 bytes of memory and copies len bytes from ptr into the newly allocated block. Its behavior can be roughly described as follows:

    void *estrndup(void *ptr, int len)
    {
        char *dst = emalloc(len + 1);
        memcpy(dst, ptr, len);
        dst[len] = 0;
        return dst;
    }

Here, the NULL byte implicitly placed at the end of the buffer ensures that any code using estrndup() to duplicate a string need not worry about passing the resulting buffer to a function, such as printf(), that expects NULL termination. When estrndup() is used to copy non-string data, that last byte is essentially wasted, but the convenience clearly outweighs the cost.

    void *safe_emalloc(size_t size, size_t count, size_t addtl);
    void *safe_pemalloc(size_t size, size_t count, size_t addtl, char persistent);

The amount of memory allocated by these functions is ((size*count)+addtl). You may ask: "Why provide extra functions? Why not just use emalloc/pemalloc and do the math yourself?" The reason is simple: safety. Although the circumstances are unlikely, the result of the calculation can overflow the integer limits of the host platform. That could result in a request for a negative number of bytes or, worse, for a positive number of bytes far smaller than the caller believes it asked for. safe_emalloc() avoids this trap by checking for integer overflow and explicitly failing when an overflow occurs.
Note that not all memory allocation routines have a corresponding p* counterpart. For example, pestrndup() does not exist, and safe_pemalloc() did not exist before PHP 5.1.
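The kind of check involved can be illustrated with a few lines of standard C. This is a sketch of the general technique rather than the code safe_emalloc() actually uses: checked_total() and checked_alloc() are hypothetical names, and returning NULL stands in for the engine's failure path.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Compute size * count + addtl into *total.
 * Returns 0 on success, or -1 if the result would not fit in size_t. */
int checked_total(size_t size, size_t count, size_t addtl, size_t *total)
{
    assert(total != NULL);

    /* size * count overflows exactly when count > SIZE_MAX / size */
    if (size != 0 && count > SIZE_MAX / size) {
        return -1;
    }
    /* adding addtl overflows when it exceeds the remaining headroom */
    if (addtl > SIZE_MAX - size * count) {
        return -1;
    }
    *total = size * count + addtl;
    return 0;
}

/* Allocate size * count + addtl bytes, refusing requests whose total
 * cannot be represented instead of silently wrapping around. */
void *checked_alloc(size_t size, size_t count, size_t addtl)
{
    size_t total;

    if (checked_total(size, count, addtl, &total) != 0) {
        return NULL;
    }
    return malloc(total);
}
```

Without the two comparisons, a request such as checked_alloc(SIZE_MAX / 2 + 1, 2, 0) would wrap around to a zero-byte allocation while the caller believed it had room for two very large elements.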

5. Reference Counting
Careful memory allocation and release is vital to the long-term performance of a multi-request process like PHP, but it is only half the picture. For a server that handles thousands of hits per second to run efficiently, each request needs to use as little memory as possible and avoid unnecessary data copying. Consider the following PHP code snippet:

    <?php
    $a = 'Hello World';
    $b = $a;
    unset($a);
    ?>

After the first line executes, a single variable has been created, and a 12-byte block of memory has been assigned to it to hold the string "Hello World" along with a terminating NULL character. Now look at the next two lines: $b is set to the same value as $a, and then $a is unset (freed).

If PHP had to copy the variable's contents on every assignment, an extra 12 bytes would need to be copied for the string in the example above, and additional processor load would be incurred during the copy. This seems ridiculous from the start, because by the time the third line runs, the original variable has been unset, making the whole copy unnecessary. Take it one step further and imagine what happens when the contents of a 10MB file are loaded into two variables: that would take up 20MB where 10MB would have been enough. Would the engine waste so much time and memory on such a useless endeavor?
As you might expect, PHP's designers anticipated this.

Remember that in the engine, variable names and their values are actually two different concepts. The value itself is an unnamed zval* storage (in this case, a string value), which is assigned to the variable $a via zend_hash_add(). What happens if both variable names point to the same value?

    {
        zval *helloval;
        MAKE_STD_ZVAL(helloval);
        ZVAL_STRING(helloval, "Hello World", 1);
        zend_hash_add(EG(active_symbol_table), "a", sizeof("a"), &helloval, sizeof(zval*), NULL);
        zend_hash_add(EG(active_symbol_table), "b", sizeof("b"), &helloval, sizeof(zval*), NULL);
    }

At this point, you could inspect either $a or $b and see that both contain the string "Hello World". Unfortunately, you then reach the third line, unset($a);. Here unset() does not know that the data pointed to by $a is also used by another variable, so it blindly frees the memory. Any subsequent access to $b would then read memory that has already been freed and crash the engine.

This problem is solved with the help of refcount, one of the zval's four members. When a variable is first created and assigned a value, its refcount is initialized to 1, because it is assumed to be used only by the variable it was created for. When your code snippet goes on to assign helloval to $b, it needs to increase refcount to 2, since the value is now referenced by two variables:

    {
        zval *helloval;
        MAKE_STD_ZVAL(helloval);
        ZVAL_STRING(helloval, "Hello World", 1);
        zend_hash_add(EG(active_symbol_table), "a", sizeof("a"), &helloval, sizeof(zval*), NULL);
        ZVAL_ADDREF(helloval);
        zend_hash_add(EG(active_symbol_table), "b", sizeof("b"), &helloval, sizeof(zval*), NULL);
    }

Now, when unset() deletes the $a copy of the variable, it can see from refcount that something else is still interested in the data, so it simply decrements refcount and otherwise leaves the value alone.
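Stripped of the engine's macros, the bookkeeping amounts to very little. Here is a self-contained sketch in plain C; the value type and the value_new(), value_addref() and value_release() helpers are hypothetical stand-ins for the zval and its macros, not engine API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* A reference-counted string value, loosely modeled on a zval. */
typedef struct {
    char *str;
    int refcount;
} value;

static int live_values = 0;  /* values that have not been destroyed yet */

/* Create a value owned by a single variable (refcount starts at 1). */
value *value_new(const char *s)
{
    value *v = malloc(sizeof(*v));
    if (!v) {
        abort();
    }
    v->str = malloc(strlen(s) + 1);
    if (!v->str) {
        abort();
    }
    strcpy(v->str, s);
    v->refcount = 1;
    live_values++;
    return v;
}

/* A second variable starts sharing the value: nothing is copied. */
value *value_addref(value *v)
{
    v->refcount++;
    return v;
}

/* A variable lets go of the value; only the last holder frees it. */
void value_release(value *v)
{
    assert(v->refcount > 0);
    if (--v->refcount == 0) {
        free(v->str);
        free(v);
        live_values--;
    }
}

int value_live_count(void)
{
    return live_values;
}
```

Mapped onto the PHP snippet, value_new() corresponds to $a = 'Hello World', value_addref() to $b = $a, and the first value_release() to unset($a): the string stays alive because $b still holds a reference.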

6. Copy on Write
It is indeed a good idea to save memory through refcounting, but what happens when you only want to change the value of one of the variables? To do this, consider the following code snippet:

    <?php
    $a = 1;
    $b = $a;
    $b += 5;
    ?>

Following the logic above, you know that $a still equals 1 and that $b ends up as 6. You also know that Zend tries to save memory by having $a and $b refer to the same zval (see the second line of code). So what happens when execution reaches the third line and the value of $b must be changed?

The answer is that Zend looks at refcount and separates the value whenever it is greater than 1. In the Zend Engine, separation is the process of breaking up a reference pair, the opposite of the process you just saw:

    zval *get_var_and_separate(char *varname, int varname_len TSRMLS_DC)
    {
        zval **varval, *varcopy;

        if (zend_hash_find(EG(active_symbol_table), varname, varname_len + 1, (void**)&varval) == FAILURE) {
            /* The variable does not exist at all: fail out */
            return NULL;
        }
        if ((*varval)->refcount < 2) {
            /* varname is the only actual reference,
             * so no separation is needed
             */
            return *varval;
        }
        /* Otherwise, make a copy of the zval* value */
        MAKE_STD_ZVAL(varcopy);
        *varcopy = **varval;
        /* Duplicate any allocated structures within the zval */
        zval_copy_ctor(varcopy);
        /* Delete the old version of varname.
         * This decreases the refcount of varval in the process.
         */
        zend_hash_del(EG(active_symbol_table), varname, varname_len + 1);
        /* Initialize the reference count of the newly created value
         * and attach it to the varname variable
         */
        varcopy->refcount = 1;
        varcopy->is_ref = 0;
        zend_hash_add(EG(active_symbol_table), varname, varname_len + 1, &varcopy, sizeof(zval*), NULL);
        /* Return the new zval* */
        return varcopy;
    }
 

Now that the engine has a zval* that it knows is owned only by the variable $b, it can convert the value to a long and add 5 to it, as the script requested.
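The same copy-on-write flow can be tried out in isolation. The sketch below is plain C with hypothetical names (value, variable, var_init(), var_assign(), var_add(), var_unset()); it models variables as slots pointing at a shared value and is not the engine's symbol table code.

```c
#include <assert.h>
#include <stdlib.h>

/* A shared integer value and a named slot that points at one. */
typedef struct {
    long lval;
    int refcount;
} value;

typedef struct {
    value *val;
} variable;

static value *value_new(long n)
{
    value *v = malloc(sizeof(*v));
    if (!v) {
        abort();
    }
    v->lval = n;
    v->refcount = 1;
    return v;
}

/* $var = n; (var must not currently hold a value) */
void var_init(variable *var, long n)
{
    var->val = value_new(n);
}

/* $dst = $src; share the value instead of copying it */
void var_assign(variable *dst, variable *src)
{
    src->val->refcount++;
    dst->val = src->val;
}

/* Give var a private copy if anyone else shares its value. */
static void var_separate(variable *var)
{
    value *old = var->val;

    if (old->refcount < 2) {
        return;                 /* sole owner: nothing to separate */
    }
    old->refcount--;            /* var stops using the shared value */
    var->val = value_new(old->lval);
}

/* $var += n; separate first so other variables are unaffected */
void var_add(variable *var, long n)
{
    var_separate(var);
    var->val->lval += n;
}

/* unset($var); the last owner frees the value */
void var_unset(variable *var)
{
    assert(var->val != NULL);
    if (--var->val->refcount == 0) {
        free(var->val);
    }
    var->val = NULL;
}
```

Running the three PHP lines through these helpers, $a and $b point at the same value after the assignment and at two different values after the addition, with the copy made only at the moment of the write.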



7. Change-on-write 

The concept of reference counting also opens up a new way of manipulating data, one that user space scripters know as a "reference". Consider the following user space code snippet:

    <?php
    $a = 1;
    $b = &$a;
    $b += 5;
    ?>

In the PHP code above, you can see that the value of $a is now 6, even though it started out as 1 and was never (directly) changed. This happens because when the engine goes to increment the value of $b by 5, it notices that $b is a reference to $a and concludes: "I can change this value without separating it, because I want every reference variable to see the change."

But how does the engine know? Simple: it looks at the fourth and last element of the zval structure, is_ref. This is a simple on/off bit that says whether the value is part of a user space style reference set. In the previous code snippet, when the first line executes, the value created for $a gets a refcount of 1 and an is_ref of 0, because it is owned by only one variable ($a) and no other variable has a change-on-write reference to it. On the second line, the value's refcount is raised to 2 as before, except that this time is_ref is also set to 1, because the script used an "&" to request a full reference.

Finally, on the third line, the engine again fetches the value associated with $b and checks whether separation is necessary. This time the value is not separated, because of a check that was left out of the earlier listing. Here is the refcount check portion of get_var_and_separate() with that check included:

    if ((*varval)->is_ref || (*varval)->refcount < 2) {
        /* varname is the only actual reference,
         * or it is a full reference to another variable;
         * either way, no separation is performed
         */
        return *varval;
    }

This time, although the refcount is 2, separation is skipped because the value is a full reference. The engine can modify it freely without worrying about the values of other variables changing unexpectedly.

8. Separation Problems
Even with the copying and referencing techniques discussed above, there are some combinations that is_ref and refcount alone cannot describe. Consider the following block of PHP code:

    <?php
    $a = 1;
    $b = $a;
    $c = &$a;
    ?>


Here you have a single value that needs to be associated with three different variables. Two of them are joined in a change-on-write full reference, while the third is in a separable copy-on-write context. Using only is_ref and refcount to describe this relationship, what values would work?
The answer is: none. In this case, the value must be duplicated into two discrete zval*s, even though both contain exactly the same data (see Figure 2).

Figure 2. Forced separation when referencing

Similarly, the following code block causes the same conflict and forces a copy of the value (see Figure 3).

Figure 3. Forced separation during copying


    <?php
    $a = 1;
    $b = &$a;
    $c = $a;
    ?>


Note that in both cases here, $b ends up associated with the original zval object, because at the time separation occurs the engine does not know the name of the third variable involved in the operation.
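Both forced separations can be traced with a small model that carries refcount and is_ref together. As before, this is only a sketch in plain C under assumed rules that match the behavior described in this section; the value and variable types and the var_* helpers are hypothetical, not the engine's implementation.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    long lval;
    int refcount;
    int is_ref;     /* 1 when the value belongs to a full reference set */
} value;

typedef struct {
    value *val;
} variable;

static value *value_new(long n)
{
    value *v = malloc(sizeof(*v));
    if (!v) {
        abort();
    }
    v->lval = n;
    v->refcount = 1;
    v->is_ref = 0;
    return v;
}

/* $var = n; */
void var_init(variable *var, long n)
{
    var->val = value_new(n);
}

/* $dst = $src; a reference set cannot take on a copy-on-write
 * member, so in that case the value is duplicated right away. */
void var_assign_copy(variable *dst, variable *src)
{
    if (src->val->is_ref) {
        dst->val = value_new(src->val->lval);
    } else {
        src->val->refcount++;
        dst->val = src->val;
    }
}

/* $dst = &$src; if src shares its value copy-on-write with other
 * variables, src is separated first and the others keep the original. */
void var_assign_ref(variable *dst, variable *src)
{
    value *old = src->val;

    if (!old->is_ref && old->refcount > 1) {
        old->refcount--;
        src->val = value_new(old->lval);
    }
    src->val->is_ref = 1;
    src->val->refcount++;
    dst->val = src->val;
}

/* $var += n; separate only a shared value that is not a reference. */
void var_add(variable *var, long n)
{
    value *old = var->val;

    if (!old->is_ref && old->refcount > 1) {
        old->refcount--;
        var->val = value_new(old->lval);
    }
    var->val->lval += n;
}

/* unset($var); the last owner frees the value */
void var_unset(variable *var)
{
    assert(var->val != NULL);
    if (--var->val->refcount == 0) {
        free(var->val);
    }
    var->val = NULL;
}
```

In the first scenario ($b = $a; $c = &$a;) the copy is made when the reference is taken; in the second ($b = &$a; $c = $a;) it is made when the plain assignment happens. Either way $b keeps the original value, as noted above.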

9. Summary
PHP is a managed language. From the ordinary user's point of view, this careful control of resources and memory means easier prototyping and fewer crashes. Once you delve under the hood, however, all those guarantees fall away, and it is ultimately up to responsible developers to maintain the integrity of the runtime environment.

http://www.bkjia.com/PHPjc/327944.html
