


Abstract: Memory management has a significant impact on long-running programs such as server daemons, so understanding how PHP allocates and releases memory is essential for writing them. This article focuses on PHP's memory management.
1. Memory
In PHP, filling a string variable is very simple. It takes a single statement, "<?php $str = 'hello world'; ?>", and the string can then be freely modified, copied, and moved. In C, although you can write a simple static string such as "char *str = "hello world";", you cannot modify that string because it lives in program space. To create a string you can manipulate, you must allocate a block of memory and copy the contents into it with a function such as strdup().
{
    char *str;
    str = strdup("hello world");
    if (!str) {
        fprintf(stderr, "Unable to allocate memory!");
    }
}
For reasons we will analyze later, the traditional memory management functions (such as malloc(), free(), strdup(), realloc(), and calloc()) are almost never used directly by the PHP source code.
2. Release memory
On almost all platforms, memory management is implemented through a request and release model. First, an application asks the layer below it (usually the "operating system"): "I want to use some memory space". If there is free space, the operating system provides it to the program and marks it so that it will not be allocated to other programs.
When the application has finished using this memory, it should return it to the OS so that it can be allocated to other programs. If the program does not return the memory, the OS has no way of knowing that it is no longer in use and can be given to another process. If a block of memory is not freed and the owning application loses track of it, the application is said to have "leaked" it, because the memory is no longer available to any program.
In a typical client application, small, infrequent memory leaks can sometimes be tolerated because the leaked memory is implicitly returned to the OS when the process later terminates. This works because the OS knows which program it allocated the memory to, and it can be sure the memory is no longer needed once that program terminates.
For long-running server daemons, including web servers like Apache and its PHP extension module, processes are designed to run for a long time. Because the OS cannot reclaim memory from a process that is still running, any leak, no matter how small, accumulates as the operation repeats and eventually exhausts all system resources.
Now consider the userspace stristr() function. To find a string using a case-insensitive search, it actually creates a lowercase copy of each of the two strings and then performs a more traditional case-sensitive search to find the relative offset. After locating the offset, however, it has no further use for the lowercase copies. If it did not free them, every script that uses stristr() would leak some memory each time it is called. Eventually, the web server process would own all of the system's memory yet be unable to use it.
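The pattern described above can be sketched in plain C. This is only an illustration, not PHP's actual stristr() implementation: the names lower_dup() and ci_find() are hypothetical, and the standard malloc()/free() pair stands in for whatever allocator the engine really uses. The point is that both temporary lowercase copies are released on every path.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a lowercase heap copy of s, or NULL. */
char *lower_dup(const char *s)
{
    size_t i, len = strlen(s);
    char *copy = malloc(len + 1);

    if (!copy) {
        return NULL;
    }
    for (i = 0; i < len; i++) {
        copy[i] = (char)tolower((unsigned char)s[i]);
    }
    copy[len] = '\0';
    return copy;
}

/* Hypothetical stand-in for stristr(): returns the offset of needle in
 * haystack ignoring case, or -1 if it is not found or memory runs out. */
long ci_find(const char *haystack, const char *needle)
{
    char *h, *n, *hit;
    long offset = -1;

    assert(haystack != NULL && needle != NULL);
    h = lower_dup(haystack);
    n = lower_dup(needle);
    if (h && n) {
        hit = strstr(h, n);          /* ordinary case-sensitive search */
        if (hit) {
            offset = (long)(hit - h);
        }
    }
    /* Both temporary copies are released on every path (free(NULL) is a no-op). */
    free(h);
    free(n);
    return offset;
}
```

Removing the two free() calls would not change the result of any single call, which is exactly why this kind of leak goes unnoticed until the process has been running for a long time.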
You can safely say that the ideal solution is to write good, clean, consistent code. This is certainly true; however, in an environment like the PHP interpreter, this view is only half true.
3. Error handling
To bail out of an active request from userspace scripts and the extension functions they rely on, the engine needs a way to jump out of that request entirely. The Zend Engine does this by setting a bailout address at the beginning of a request and then performing a longjmp() to that address on any die() or exit() call, or on any critical error (E_ERROR).
Although this bailout process simplifies program flow, it almost always means that resource cleanup code (such as free() calls) is skipped, which can leak memory. Now consider the following simplified version of the engine code that handles function calls:
void call_function(const char *fname, int fname_len TSRMLS_DC)
{
    zend_function *fe;
    char *lcase_fname;
    /* PHP function names are case-insensitive.
     * To simplify locating them in the function table,
     * all function names are implicitly translated to lowercase.
     */
    lcase_fname = estrndup(fname, fname_len);
    zend_str_tolower(lcase_fname, fname_len);
    /* Execute the function only when the lookup succeeds */
    if (zend_hash_find(EG(function_table), lcase_fname, fname_len + 1, (void **)&fe) == SUCCESS) {
        zend_execute(fe->op_array TSRMLS_CC);
    } else {
        php_error_docref(NULL TSRMLS_CC, E_ERROR, "Call to undefined function: %s()", fname);
    }
    efree(lcase_fname);
}
When the php_error_docref() line is executed, the internal error handler sees that the error level is critical and calls longjmp() to interrupt the current program flow and leave call_function() without ever reaching the efree(lcase_fname) line. You might want to move the efree() line above the php_error_docref() line, but what about the code that called call_function() in the first place? fname itself is most likely an allocated string, and it cannot be freed until the error message has used it.
Note that php_error_docref() is the internal equivalent of the trigger_error() function. Its first parameter is an optional documentation reference that is appended to the docref_root value when that setting is enabled in php.ini. The third parameter can be any of the familiar E_* constants indicating the severity of the error. The fourth and subsequent arguments follow printf()-style formatting with a variable argument list.
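The effect of a bailout on cleanup code can be reproduced outside the engine with the standard setjmp()/longjmp() pair. The sketch below is not Zend Engine code; bailout, counted_malloc(), counted_free(), do_work(), and run_request() are hypothetical names, and a simple counter tracks blocks that were never freed.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdlib.h>

jmp_buf bailout;      /* hypothetical per-request bailout address */
int live_blocks;      /* allocations that have not been freed yet */
void *last_block;     /* remembered only so a demo can tidy up afterwards */

void *counted_malloc(size_t size)
{
    void *p = malloc(size);

    if (p) {
        live_blocks++;
        last_block = p;
    }
    return p;
}

void counted_free(void *p)
{
    if (p) {
        free(p);
        live_blocks--;
    }
}

/* Allocates a scratch buffer, may hit a "fatal error", then frees it. */
void do_work(int fatal)
{
    char *tmp = counted_malloc(32);

    assert(tmp != NULL);
    if (fatal) {
        longjmp(bailout, 1);     /* execution never reaches counted_free() */
    }
    counted_free(tmp);
}

/* Runs one simulated request; returns how many blocks it left behind. */
int run_request(int fatal)
{
    int before = live_blocks;

    if (setjmp(bailout) == 0) {
        do_work(fatal);
    }
    return live_blocks - before;
}
```

Calling run_request(0) leaves nothing behind, while run_request(1) strands one block: the jump is cheap and simple, but nothing on the skipped path gets a chance to clean up.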
4. Zend Memory Manager
One solution to the memory leaks caused by the bailout described above is the Zend Memory Manager (ZendMM) layer. This part of the engine behaves much like the operating system's own memory management: it allocates memory to the calling program. The difference is that it sits low enough in the process space to be "request aware", so when a request ends it can do what the OS does when a process terminates. That is, it implicitly releases all memory owned by that request. Figure 1 shows the relationship between ZendMM, the OS, and the PHP process.

Figure 1. The Zend memory manager replaces system calls to implement memory allocation for each request.
In addition to providing implicit memory cleanup, ZendMM also controls the memory usage of each request according to the memory_limit setting in php.ini. If a script attempts to request more memory than is available on the system, or more than its configured limit allows, ZendMM automatically issues an E_ERROR message and starts the bailout process. An added benefit of this approach is that the return value of most memory allocation calls does not need to be checked, since failure results in an immediate jump to the bailout part of the engine.
The principle behind "hooking" PHP's internal code into the OS's actual memory management is not complicated: all internally allocated memory is requested through a specific set of alternative functions. For example, instead of using malloc(16) to allocate a 16-byte block of memory, PHP code uses emalloc(16). In addition to performing the actual allocation, ZendMM also flags the block with information about the request it is bound to, so that when a request bails out, ZendMM can implicitly release it.
Often, memory needs to be allocated for longer than the duration of a single request. Such allocations (called "persistent allocations" because they persist after a request completes) can be made with the traditional memory allocators, because they do not need the extra per-request information that ZendMM adds. Sometimes, however, it is not known until runtime whether a particular allocation needs to be persistent, so ZendMM exports a set of helper macros that behave like the other allocation functions but take an extra final parameter indicating whether the allocation is persistent.
If you really want a persistent allocation, set this parameter to 1; the request is then passed through to the traditional malloc() family of allocators. If the runtime logic determines that the block does not need to be persistent, set the parameter to 0 and the call is routed to the per-request allocator functions instead.
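One way to picture a request-aware allocator is a wrapper that keeps every block on a list and walks that list when the request ends. The following is a toy sketch under that assumption, not ZendMM's real data structure; req_block, req_malloc(), req_free(), and req_shutdown() are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

/* Bookkeeping header placed in front of every request-bound block. */
typedef struct req_block {
    struct req_block *prev, *next;
} req_block;

req_block *req_head = NULL;   /* blocks owned by the current request */

void *req_malloc(size_t size)
{
    req_block *b;

    assert(size <= (size_t)-1 - sizeof(req_block));   /* no size wraparound */
    b = malloc(sizeof(req_block) + size);
    if (!b) {
        return NULL;
    }
    b->prev = NULL;
    b->next = req_head;
    if (req_head) {
        req_head->prev = b;
    }
    req_head = b;
    return b + 1;             /* the caller sees the memory after the header */
}

void req_free(void *ptr)
{
    req_block *b;

    if (!ptr) {
        return;
    }
    b = (req_block *)ptr - 1; /* step back to the bookkeeping header */
    if (b->prev) {
        b->prev->next = b->next;
    } else {
        req_head = b->next;
    }
    if (b->next) {
        b->next->prev = b->prev;
    }
    free(b);
}

/* End of request: release whatever the code forgot or skipped.
 * Returns the number of blocks that had to be reclaimed. */
size_t req_shutdown(void)
{
    size_t released = 0;

    while (req_head) {
        req_block *next = req_head->next;
        free(req_head);
        req_head = next;
        released++;
    }
    return released;
}
```

The header in front of each block is also why a pointer from one allocator family must never be handed to the other family's free function: the plain free() would be given an address it never returned, and the request-aware free would look for a header that is not there.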
For example, pemalloc(buffer_len, 1) maps to malloc(buffer_len), whereas pemalloc(buffer_len, 0) maps to emalloc(buffer_len), by way of the following #define in Zend/zend_alloc.h:
#define pemalloc(size, persistent) ((persistent) ? malloc(size) : emalloc(size))
Table 1 lists each allocator function provided by ZendMM alongside its more traditional counterpart.
Table 1. Traditional versus PHP-specific allocators.
Allocator function | e/pe counterpart |
void *malloc(size_t count); | void *emalloc(size_t count); void *pemalloc(size_t count, char persistent); |
void *calloc(size_t nmemb, size_t count); | void *ecalloc(size_t nmemb, size_t count); void *pecalloc(size_t nmemb, size_t count, char persistent); |
void *realloc(void *ptr, size_t count); | void *erealloc(void *ptr, size_t count); void *perealloc(void *ptr, size_t count, char persistent); |
char *strdup(const char *str); | char *estrdup(const char *str); char *pestrdup(const char *str, char persistent); |
void free(void *ptr); | void efree(void *ptr); void pefree(void *ptr, char persistent); |
You may notice that even pefree() requires the persistent flag. This is because at the time pefree() is called, it has no way of knowing whether ptr was a persistent allocation. Calling free() on a non-persistent allocation can lead to a double free, while calling efree() on a persistent allocation is likely to cause a segfault because the memory manager will look for management information that does not exist. Your code therefore needs to remember whether each data structure it allocates is persistent.
In addition to the core part of the allocator function, there are also some other very convenient ZendMM-specific functions, such as:
void *estrndup(void *ptr, int len);
This function allocates len+1 bytes of memory and copies len bytes from ptr into the newly allocated block. The behavior of estrndup() can be roughly described as follows:
void *estrndup(void *ptr, int len)
{
    char *dst = emalloc(len + 1);
    memcpy(dst, ptr, len);
    dst[len] = 0;
    return dst;
}
Here, the NULL byte implicitly placed at the end of the buffer ensures that any function using estrndup() to copy a string need not worry about passing the resulting buffer to a function such as printf() that expects a NULL terminator. When estrndup() is used to copy non-string data, that last byte is essentially wasted, but the convenience clearly outweighs the cost.
void *safe_emalloc(size_t size, size_t count, size_t addtl);
void *safe_pemalloc(size_t size, size_t count, size_t addtl, char persistent);
These functions allocate a block whose final size is ((size * count) + addtl). You may ask: "Why provide extra functions? Why not just use emalloc()/pemalloc()?" The reason is simple: safety. Although the chance is sometimes quite small, such a calculation can overflow the integer limits of the host platform. This may result in an attempt to allocate a negative number of bytes or, even worse, in allocating fewer bytes than the calling program expects. safe_emalloc() avoids this trap by checking for integer overflow and explicitly failing when such an overflow occurs.
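The overflow check itself is easy to sketch with standard C. The function below is a hypothetical stand-in named checked_alloc(), not the engine's safe_emalloc(): it uses plain malloc() and returns NULL when size * count + addtl would wrap around, whereas the text above says the real function fails explicitly.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for safe_emalloc(): allocate size * count + addtl
 * bytes, refusing (NULL) when that sum cannot be represented in a size_t. */
void *checked_alloc(size_t size, size_t count, size_t addtl)
{
    size_t total;

    if (count != 0 && size > SIZE_MAX / count) {
        return NULL;                 /* size * count would overflow */
    }
    total = size * count;
    if (total > SIZE_MAX - addtl) {
        return NULL;                 /* adding addtl would overflow */
    }
    assert(total + addtl >= total);  /* cannot wrap, given the checks above */
    return malloc(total + addtl);
}
```

Without the two checks, a request such as checked_alloc(SIZE_MAX / 2, 2, 8) would silently wrap to a tiny size, and the caller would then write far past the end of the block it received.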
Note that not all memory allocation routines have a corresponding p* equivalent implementation. For example, pestrndup() does not exist, and safe_pemalloc() did not exist before PHP 5.1.
5. Reference Counting
Careful allocation and release of memory has a significant impact on the long-term performance of PHP, which is a multi-request process; however, that is only half of the picture. For a server that handles thousands of hits per second to run efficiently, each request needs to use as little memory as possible and avoid unnecessary data copying. Consider the following PHP code snippet:
<?php
$a = 'Hello World';
$b = $a;
unset($a);
?>
After the first line, only one variable has been created, and a 12-byte block of memory has been assigned to it to hold the string "Hello World" together with its terminating NULL character. Now look at the next two lines: $b is set to the same value as $a, and then $a is released.
If PHP had to copy the variable's contents on every assignment, the string in this example would require an additional 12 bytes, and additional processor time would be spent on the copy. This already looks wasteful, because by the time the third line executes the original variable is unset, making the whole copy unnecessary. Take it one step further and imagine loading the contents of a 10MB file into two variables. That would take up 20MB when 10MB is enough. Would the engine really waste so much time and memory on such a useless endeavor?
You can be sure the designers of PHP thought of this.
Remember that in the engine, variable names and their values are two different concepts. The value itself is an unnamed zval* (in this case holding a string), which is assigned to the variable $a via zend_hash_add(). What happens if two variable names point to the same value?
{
    zval *helloval;
    MAKE_STD_ZVAL(helloval);
    ZVAL_STRING(helloval, "Hello World", 1);
    zend_hash_add(EG(active_symbol_table), "a", sizeof("a"), &helloval, sizeof(zval*), NULL);
    zend_hash_add(EG(active_symbol_table), "b", sizeof("b"), &helloval, sizeof(zval*), NULL);
}
At this point, you could inspect either $a or $b and see that both contain the string "Hello World". Unfortunately, the script then goes on to execute its third line, "unset($a);". Here unset() does not know that the data pointed to by $a is also used by another variable, so it blindly frees the memory. Any subsequent access to $b would then read freed memory and cause the engine to crash.
This problem is solved by one of the zval's members, refcount. When a variable is first created and assigned a value, its refcount is initialized to 1 because it is assumed to be used only by the variable it was created for. When your code snippet goes on to assign helloval to $b, it needs to increase the refcount to 2, because the value is now referenced by two variables:
{
    zval *helloval;
    MAKE_STD_ZVAL(helloval);
    ZVAL_STRING(helloval, "Hello World", 1);
    zend_hash_add(EG(active_symbol_table), "a", sizeof("a"), &helloval, sizeof(zval*), NULL);
    ZVAL_ADDREF(helloval);
    zend_hash_add(EG(active_symbol_table), "b", sizeof("b"), &helloval, sizeof(zval*), NULL);
}
Now, when unset() deletes the $a copy of the variable, it can see from the refcount that someone else is still interested in the data, so it simply decrements the refcount and leaves the value alone.
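The bookkeeping just described can be modeled in a few lines of standalone C. This is a toy model rather than the engine's zval: toy_val, toy_new(), toy_addref(), and toy_delref() are hypothetical names chosen for the illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy value with a reference count; not the engine's zval. */
typedef struct {
    char *str;
    unsigned refcount;
} toy_val;

/* Create a value owned by exactly one variable. Returns NULL on failure. */
toy_val *toy_new(const char *s)
{
    size_t len = strlen(s);
    toy_val *v = malloc(sizeof(*v));

    if (!v) {
        return NULL;
    }
    v->str = malloc(len + 1);
    if (!v->str) {
        free(v);
        return NULL;
    }
    memcpy(v->str, s, len + 1);
    v->refcount = 1;
    return v;
}

/* A second variable starts pointing at the same value. */
void toy_addref(toy_val *v)
{
    assert(v != NULL);
    v->refcount++;
}

/* A variable lets go of the value.
 * Returns 1 if the value was actually destroyed, 0 if others still use it. */
int toy_delref(toy_val *v)
{
    assert(v != NULL && v->refcount > 0);
    if (--v->refcount > 0) {
        return 0;
    }
    free(v->str);
    free(v);
    return 1;
}
```

In terms of the script above, toy_new() corresponds to "$a = 'Hello World';", toy_addref() to "$b = $a;", and the first toy_delref() to "unset($a);", after which the string is still intact for $b.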
6. Copy on Write
It is indeed a good idea to save memory through refcounting, but what happens when you only want to change the value of one of the variables? To do this, consider the following code snippet:
<?php
$a = 1;
$b = $a;
$b += 5;
?>
Following the logic above, you know that $a still equals 1 and that $b ends up as 6. You also know that Zend is trying to save memory by making $a and $b both refer to the same zval (see the second line of code). So what happens when execution reaches the third line and the value of $b must change?
The answer is that Zend looks at the refcount and separates the value if it is greater than 1. In the Zend Engine, separation is the process of breaking up a reference pair, the opposite of the process you just saw:
zval *get_var_and_separate(char *varname, int varname_len TSRMLS_DC)
{
    zval **varval, *varcopy;
    if (zend_hash_find(EG(active_symbol_table), varname, varname_len + 1, (void**)&varval) == FAILURE) {
        /* The variable does not exist at all; fail out */
        return NULL;
    }
    if ((*varval)->refcount < 2) {
        /* varname is the only actual reference,
         * so no separation is needed
         */
        return *varval;
    }
    /* Otherwise, make a copy of the zval* value */
    MAKE_STD_ZVAL(varcopy);
    *varcopy = **varval;
    /* Duplicate any allocated structures within the zval* */
    zval_copy_ctor(varcopy);
    /* Delete the old version of varname.
     * This decreases the refcount of varval in the process
     */
    zend_hash_del(EG(active_symbol_table), varname, varname_len + 1);
    /* Initialize the reference count of the newly created value
     * and attach it to the varname variable
     */
    varcopy->refcount = 1;
    varcopy->is_ref = 0;
    zend_hash_add(EG(active_symbol_table), varname, varname_len + 1, &varcopy, sizeof(zval*), NULL);
    /* Return the new zval* */
    return varcopy;
}
Now that the engine has a zval* owned only by the variable $b (and knows it), it can convert the value to a long and increment it by 5 as the script requested.
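Stripped of the symbol-table calls, copy-on-write separation reduces to a few pointer operations. The sketch below is a self-contained toy, assuming a value that holds only a long; toy_val, toy_var, and toy_separate() are hypothetical names, not engine APIs.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy value: a long plus the number of variables sharing it. */
typedef struct {
    long lval;
    unsigned refcount;
} toy_val;

/* A "variable" is just a slot pointing at a possibly shared value. */
typedef struct {
    toy_val *val;
} toy_var;

/* Give var a private value if it currently shares one.
 * Returns the value that is now safe to modify, or NULL on failure. */
toy_val *toy_separate(toy_var *var)
{
    toy_val *copy;

    assert(var != NULL && var->val != NULL);
    if (var->val->refcount < 2) {
        return var->val;             /* sole owner: nothing to separate */
    }
    copy = malloc(sizeof(*copy));
    if (!copy) {
        return NULL;
    }
    copy->lval = var->val->lval;     /* duplicate the data */
    copy->refcount = 1;              /* the copy has exactly one owner */
    var->val->refcount--;            /* detach from the shared value */
    var->val = copy;
    return copy;
}
```

With two variables sharing a value of 1, separating the second one and adding 5 to the returned value leaves the first at 1 and the second at 6, mirroring the script above. The copy is made only at the moment of the write, so variables that are never modified never pay for it.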
7. Change-on-write
The introduction of reference counting also opens up a new way of manipulating data, one that userspace scripters know as "references". Consider the following userspace code snippet:
<?php
$a = 1;
$b = &$a;
$b += 5;
?>
In the PHP code above, you can see that the value of $a is now 6, even though it started out as 1 and was never (directly) changed. This happens because when the engine starts to increment the value of $b by 5, it notices that $b is a reference to $a and decides: "I can change this value without separating it, because I want every referencing variable to see the change."
But how does the engine know? It simply looks at the fourth and last element of the zval structure, is_ref. This is a simple on/off bit that defines whether the value is part of a userspace-style reference set. In the previous code snippet, when the first line is executed, the value created for $a gets a refcount of 1 and an is_ref of 0, because it is owned by only one variable ($a) and no other variable has a change-on-write reference to it. On the second line, the refcount of this value is increased to 2 as before, except that this time is_ref is also set to 1 (because the script used an "&" to request a full reference).
Finally, on the third line, the engine once again fetches the value associated with $b and checks whether separation is necessary. This time the value is not separated, because of a check that was left out earlier. Here is the refcount check from get_var_and_separate() again, this time including that condition:
if ((*varval)->is_ref || (*varval)->refcount < 2) {
    /* varname is the only actual reference,
     * or it is a full reference to other variables;
     * either way, no separation is performed
     */
    return *varval;
}
This time, even though the refcount is 2, separation is skipped because the value is a full reference. The engine can modify it freely without worrying about the values of the other variables.
Although the copying and referencing techniques just described work well on their own, there are combinations that is_ref and refcount cannot represent. Consider a PHP code block along these lines:
<?php
$a = 1;
$b = $a;
$c = &$a;
?>
Here, you have a single value that needs to be associated with three different variables. Two of them are in a full "change-on-write" reference, while the third is in a separable "copy-on-write" context. If only is_ref and refcount are used to describe this relationship, what values would work?
The answer is: None of them work. In this case, the value must be copied into two separate zval*s, although both contain exactly the same data (see Figure 2).
Figure 2. Forced separation when referencing
Similarly, a code block like the following causes the same conflict and forces a copy of the value (see Figure 3).
<?php
$a = 1;
$b = &$a;
$c = $a;
?>
Figure 3. Forced separation during copying
Note that in both cases, $b remains associated with the original zval object, because at the point where separation occurs the engine has no way of knowing the name of the third variable involved in the operation.
PHP is a managed language. From the ordinary user's perspective, this careful control of resources and memory means easier prototyping and fewer crashes. Once we go deeper "under the hood", however, all such guarantees disappear, and ultimately we must rely on truly responsible developers to maintain the consistency of the whole runtime environment.