Detailed explanation of the ini configuration principle in PHP

WBOY
Original 2016-05-16 08:59:57

Anyone who uses PHP knows that the php.ini configuration stays in effect for the entire SAPI life cycle. If you edit php.ini while PHP is running, the change does not take effect until the SAPI is restarted. If you cannot restart Apache or nginx at that moment, the only option is to call ini_set explicitly in the PHP code. ini_set is a function provided by PHP for changing the configuration dynamically. Note that settings made with ini_set and settings made in the ini file have different lifetimes: once the PHP script finishes executing, the ini_set settings are discarded immediately.

Therefore, this article is divided into two parts. The first part explains the principle of php.ini configuration, and the second part talks about dynamically modifying the php configuration.

The php.ini configuration roughly involves three pieces of data: configuration_hash, EG(ini_directives), and the module globals accessed through PG, BG, PCRE_G, JSON_G, XXX_G, and so on. It does not matter if these three kinds of data are unfamiliar; they are explained in detail below.

1. Parse INI configuration file

Since php.ini must be in effect for the whole SAPI life cycle, parsing the ini file and building the PHP configuration from it has to happen at the very beginning of the SAPI, that is, during PHP startup. PHP needs these settings to exist internally before any actual request arrives.

In the PHP core, this is the job of the php_module_startup function.

php_module_startup is mainly responsible for starting PHP and is usually called when the SAPI starts. Another common function is php_request_startup, which initializes each request when it arrives. php_module_startup and php_request_startup are two landmark steps, but analyzing them is beyond the scope of this article.

For example, when PHP is loaded as a module under Apache, Apache activates all of its modules at startup, including the PHP module. Activating the PHP module calls php_module_startup. The function does a lot of work; once the call returns, PHP has started and can accept requests and respond to them.

In the php_module_startup function, the implementation related to parsing the ini file is:

The code is as follows:

/* this will read in php.ini, set up the configuration parameters,
   load zend extensions and register php function extensions
   to be loaded later */
if (php_init_config(TSRMLS_C) == FAILURE) {
    return FAILURE;
}

As you can see, php_init_config is what actually parses the ini file. Parsing mainly consists of lexical and grammar analysis, which extracts and saves the key-value pairs in the ini file. The format of php.ini is very simple: the key is on the left of the equals sign and the value is on the right. Whenever a key-value pair is extracted, where does PHP store it? The answer is the configuration_hash mentioned earlier.

static HashTable configuration_hash;
configuration_hash is declared in php_ini.c and is a HashTable, which, as the name suggests, is a hash table. As an aside, configuration_hash cannot be accessed from outside in versions before PHP 5.3 because it is a static variable in php_ini.c. PHP 5.3 added the php_ini_get_configuration_hash interface, which simply returns &configuration_hash, so that PHP extensions can easily inspect configuration_hash.
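To make the "key on the left, value on the right" idea concrete, here is a minimal sketch of splitting one ini line into a key and a value. It is only an illustration: the real PHP parser is a generated lexer and grammar, and the function name sketch_parse_ini_line and its buffer-based interface are assumptions made for this example, not part of php-src.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of php-src: split one "key = value" line.
 * Copies the trimmed key and value into caller-provided buffers.
 * Returns 1 on success, 0 for blank lines, comments, section headers,
 * or input that does not fit the buffers. */
int sketch_parse_ini_line(const char *line, char *key, size_t key_cap,
                          char *val, size_t val_cap)
{
    const char *eq, *kend, *vend;
    size_t klen, vlen;

    while (isspace((unsigned char)*line)) line++;   /* skip leading blanks */
    if (*line == '\0' || *line == ';' || *line == '[') return 0;

    eq = strchr(line, '=');
    if (eq == NULL) return 0;

    kend = eq;                                      /* trim the key on the right */
    while (kend > line && isspace((unsigned char)kend[-1])) kend--;
    klen = (size_t)(kend - line);
    if (klen == 0 || klen >= key_cap) return 0;

    eq++;                                           /* trim the value on both sides */
    while (isspace((unsigned char)*eq)) eq++;
    vend = eq + strlen(eq);
    while (vend > eq && isspace((unsigned char)vend[-1])) vend--;
    vlen = (size_t)(vend - eq);
    if (vlen >= val_cap) return 0;

    memcpy(key, line, klen); key[klen] = '\0';
    memcpy(val, eq, vlen);   val[vlen] = '\0';
    return 1;
}

/* Usage example: call this from your own main(). */
void sketch_parse_ini_line_demo(void)
{
    char k[64], v[64];
    assert(sketch_parse_ini_line("hello=world", k, sizeof k, v, sizeof v) == 1);
    assert(strcmp(k, "hello") == 0 && strcmp(v, "world") == 0);
}
```

Each pair extracted this way would then be stored in a hash table keyed by the setting name, which is the role configuration_hash plays.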

Note four points:

First, php_init_config performs no validation beyond lexical and syntax checks. In other words, if we add the line hello=world to the ini file, then as long as it is a correctly formatted configuration item, the final configuration_hash will contain an element with the key hello and the value world. configuration_hash mirrors the ini file as faithfully as possible.

Second, the ini file allows us to configure in the form of an array. For example, write the following three lines in the ini file:

The code is as follows:

drift.arr[]=1
drift.arr[]=2
drift.arr[]=3

Then in the final generated configuration_hash table, there will be an element with the key drift.arr, and its value is an array containing three numbers: 1, 2, and 3. This is an extremely rare configuration method.
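As a rough model of this behavior, the sketch below collects values written with the name[] syntax under a single key. The struct sketch_array_entry and the function sketch_append_array_item are hypothetical names invented for this illustration; PHP itself stores such values as an array inside configuration_hash.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_MAX_ITEMS 8

/* Hypothetical model of one array-valued setting such as drift.arr. */
struct sketch_array_entry {
    char key[64];                          /* name without the "[]" suffix */
    const char *items[SKETCH_MAX_ITEMS];   /* values in the order they appeared */
    int count;
};

/* If name ends in "[]", strip the suffix and append value to the entry.
 * Returns 1 if the value was appended, 0 otherwise. */
int sketch_append_array_item(struct sketch_array_entry *e,
                             const char *name, const char *value)
{
    size_t len = strlen(name);

    if (len < 3 || strcmp(name + len - 2, "[]") != 0) return 0; /* not array syntax */
    len -= 2;
    if (len >= sizeof e->key || e->count >= SKETCH_MAX_ITEMS) return 0;

    if (e->count == 0) {
        memcpy(e->key, name, len);         /* first item fixes the key */
        e->key[len] = '\0';
    } else if (strlen(e->key) != len || strncmp(e->key, name, len) != 0) {
        return 0;                          /* belongs to a different key */
    }
    e->items[e->count++] = value;
    return 1;
}

/* Usage example: call this from your own main(). */
void sketch_append_array_item_demo(void)
{
    struct sketch_array_entry e;
    memset(&e, 0, sizeof e);
    assert(sketch_append_array_item(&e, "drift.arr[]", "1") == 1);
    assert(sketch_append_array_item(&e, "drift.arr[]", "2") == 1);
    assert(e.count == 2);
}
```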

Third, PHP also allows additional ini files besides the default php.ini (php-%s.ini, to be precise). These ini files are placed in an additional directory specified by the environment variable PHP_INI_SCAN_DIR. After php_init_config has parsed php.ini, it scans this directory and parses every .ini file it finds there. The key-value pairs from these additional ini files are also added to configuration_hash.

This feature is occasionally useful. If we develop a PHP extension ourselves but do not want to mix its configuration into php.ini, we can write a separate ini file and tell PHP where to find it through PHP_INI_SCAN_DIR. The disadvantage is equally obvious: it requires setting an extra environment variable. A better solution is for the extension to call php_parse_user_ini_file or zend_parse_ini_file itself to parse the corresponding ini file.

Fourth, in configuration_hash the key is a string, so what is the type of the value? The answer is also a string (except for the special array case mentioned above). For example, given the following configuration:

The code is as follows:

display_errors = On
log_errors = Off
log_errors_max_len = 1024

Then the key-value pairs actually stored in the final configuration_hash are:

The code is as follows:

key: "display_errors"
val : "1"

key: "log_errors"
val : ""

key: "log_errors_max_len"
val : "1024"

Note log_errors: the value stored for it is not even "0" but a genuinely empty string. Likewise, log_errors_max_len is stored not as a number but as the string "1024".
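The mapping shown above can be sketched as a small normalization step. This is only a model of the observable result (On becomes "1", Off becomes an empty string, anything else stays as written). The function name sketch_normalize_ini_value is hypothetical, and treating yes/true and no/false the same way as on/off is an assumption of this sketch.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Case-insensitive string equality, written out to avoid non-standard calls. */
static int sketch_ieq(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b)) return 0;
        a++;
        b++;
    }
    return *a == '\0' && *b == '\0';
}

/* Hypothetical model of how a raw ini value ends up stored as a string:
 * truthy keywords become "1", falsy keywords become "", and every other
 * value is returned unchanged. */
const char *sketch_normalize_ini_value(const char *raw)
{
    if (sketch_ieq(raw, "on") || sketch_ieq(raw, "yes") || sketch_ieq(raw, "true")) {
        return "1";
    }
    if (sketch_ieq(raw, "off") || sketch_ieq(raw, "no") || sketch_ieq(raw, "false")) {
        return "";
    }
    return raw;
}

/* Usage example: call this from your own main(). */
void sketch_normalize_ini_value_demo(void)
{
    assert(strcmp(sketch_normalize_ini_value("On"), "1") == 0);
    assert(sketch_normalize_ini_value("Off")[0] == '\0');
    assert(strcmp(sketch_normalize_ini_value("1024"), "1024") == 0);
}
```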

At this point in the analysis, basically everything related to parsing the ini file has been explained clearly. To briefly summarize:

1. Parsing ini occurs in the php_module_startup stage

2. The parsing results are stored in configuration_hash.

2. Applying the configuration to modules

PHP's overall structure can be seen as the Zend engine at the bottom, which is responsible for interacting with the OS, compiling PHP code, managing memory, and so on. Many modules sit on top of the Zend engine. The central one is the Core module; others include Standard, PCRE, Date, Session, and so on. These modules are also known as PHP extensions. Put simply, each module provides a set of functions for developers to call. For example, commonly used built-in functions such as explode and trim are provided by the Standard module.

We need to discuss this because php.ini contains, in addition to settings for PHP itself, that is, for the Core module (such as safe_mode, display_errors, max_execution_time), quite a few settings for other modules.

For example, the date module provides common date, time, strtotime and other functions. In php.ini, its related configuration looks like:

The code is as follows:

[Date]
;date.timezone = 'Asia/Shanghai'
;date.default_latitude = 31.7667
;date.default_longitude = 35.2333
;date.sunrise_zenith = 90.583333
;date.sunset_zenith = 90.583333

In addition to these modules having independent configurations, the zend engine is also configurable, but the zend engine has very few configurable items, only error_reporting, zend.enable_gc and detect_unicode.

As mentioned in the previous section, php_module_startup calls php_init_config, whose purpose is to parse the ini file and generate configuration_hash. What does php_module_startup do next? Clearly, the configuration in configuration_hash has to be applied to the different modules such as Zend, Core, Standard, and SPL. This does not happen in a single step, because PHP usually contains many modules, and they are started one after another during PHP startup. The configuration of module A is therefore applied while module A itself starts up.

Readers with experience in extension development will point out right away that module A is started in PHP_MINIT_FUNCTION(A).

Yes, if module A needs to be configured, then in PHP_MINIT_FUNCTION, you can call REGISTER_INI_ENTRIES() to complete it. REGISTER_INI_ENTRIES will search the configuration_hash for the configuration value set by the user based on the name of the configuration item required by the current module, and update it to the module's own global space.

2.1. Global space of a module

To understand how the ini configuration is applied from configuration_hash to each module, we first need to understand the global space of a PHP module. Each PHP module can set aside a storage space of its own that is globally visible within the module. Generally, it is used to store the ini configuration the module needs. In other words, the configuration items in configuration_hash eventually end up in this global space. While the module executes, it only needs to read this global space to obtain the user's settings for the module. The space is also often used to record intermediate data during the module's execution.

Let's take the bcmath module as an example. bcmath is a PHP module that provides functions for mathematical calculations. First, let's look at its ini configuration:

The code is as follows:

PHP_INI_BEGIN()
STD_PHP_INI_ENTRY("bcmath.scale", "0", PHP_INI_ALL, OnUpdateLongGEZero, bc_precision, zend_bcmath_globals, bcmath_globals)
PHP_INI_END()

bcmath has only one configuration item. We can use bcmath.scale in php.ini to configure the bcmath module.

Next, look at the definition of the bcmath module's global space. php_bcmath.h contains the following declaration:

The code is as follows:

ZEND_BEGIN_MODULE_GLOBALS(bcmath)
bc_num _zero_;
bc_num _one_;
bc_num _two_;
long bc_precision;
ZEND_END_MODULE_GLOBALS(bcmath)

After the macro is expanded, it is:

The code is as follows:

typedef struct _zend_bcmath_globals {
bc_num _zero_;
bc_num _one_;
bc_num _two_;
long bc_precision;
} zend_bcmath_globals;

The zend_bcmath_globals type is the type of the global space in the bcmath module. Only the zend_bcmath_globals structure is declared here; the actual variable is defined in bcmath.c:

/* After expansion, this is: zend_bcmath_globals bcmath_globals; */
ZEND_DECLARE_MODULE_GLOBALS(bcmath)

As you can see, ZEND_DECLARE_MODULE_GLOBALS defines the variable bcmath_globals.

bcmath_globals is a real global space, which contains four fields. Its last field, bc_precision, corresponds to bcmath.scale in the ini configuration. We set the value of bcmath.scale in php.ini, and then when starting the bcmath module, the value of bcmath.scale is updated to bcmath_globals.bc_precision.

Copying values from configuration_hash into the xxx_globals variable defined by each module is what is meant by applying the ini configuration to the module. Once the module has started, these settings are in place. In the subsequent execution phase, the PHP module therefore does not need to access configuration_hash again; it only needs to read its own XXX_globals to get the configuration set by the user.

Besides the one field for the ini configuration item, what are the other three fields of bcmath_globals for? This is the second role of the module global space: in addition to holding ini configuration, it can store data used during module execution.

Another example is the json module, which is also a very commonly used module in PHP:

The code is as follows:

ZEND_BEGIN_MODULE_GLOBALS(json)
int error_code;
ZEND_END_MODULE_GLOBALS(json)

You can see that the json module does not require ini configuration, and its global space has only one field error_code. error_code records the errors that occurred in the last execution of json_decode or json_encode. The json_last_error function returns this error_code to help users locate the cause of the error.

To make module global variables easy to access, PHP conventionally defines some macros. For example, to access error_code in json_globals we could write json_globals.error_code directly (this does not work in a multi-threaded build), but the more general approach is to define the JSON_G macro:

The code is as follows:

#define JSON_G(v) (json_globals.v)

We use JSON_G(error_code) to access json_globals.error_code. At the beginning of this article, I mentioned PG, BG, JSON_G, PCRE_G, XXX_G, etc. These macros are also very common in PHP source code. Now we can easily understand them. The PG macro can access the global variables of the Core module, BG can access the global variables of the Standard module, and PCRE_G can access the global variables of the PCRE module.

The code is as follows:

#define PG(v) (core_globals.v)
#define BG(v) (basic_globals.v)
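The accessor-macro pattern can be tried outside of PHP with a small stand-alone sketch. Everything here is hypothetical: the module name "demo", the struct sketch_demo_globals, the macro DEMO_G, and the function sketch_demo_divide are invented for illustration, and the sketch only models the non-thread-safe form of the macro shown above.

```c
#include <assert.h>

/* Hypothetical module "demo": this struct plays the role of a module's
 * xxx_globals type. */
typedef struct sketch_demo_globals {
    long scale;       /* would mirror an ini setting such as demo.scale */
    int  last_error;  /* runtime state that has nothing to do with ini */
} sketch_demo_globals;

/* The single instance, comparable to bcmath_globals or json_globals. */
sketch_demo_globals demo_globals = { 0, 0 };

/* Accessor macro in the style of JSON_G, PG and BG. */
#define DEMO_G(v) (demo_globals.v)

/* A module function that records its status in the global space, similar to
 * how the json module keeps error_code for json_last_error. */
int sketch_demo_divide(long a, long b, long *out)
{
    if (b == 0) {
        DEMO_G(last_error) = 1;   /* remember the failure for later queries */
        return -1;
    }
    DEMO_G(last_error) = 0;
    *out = a / b;
    return 0;
}

/* Usage example: call this from your own main(). */
void sketch_demo_globals_demo(void)
{
    long q = 0;
    DEMO_G(scale) = 4;
    assert(demo_globals.scale == 4);
    assert(sketch_demo_divide(6, 3, &q) == 0 && q == 2);
    assert(DEMO_G(last_error) == 0);
}
```

Hiding the variable behind a macro means calling code does not change if the storage behind the macro is later defined differently, which is why the macro form is the more general one.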

2.2. How to determine what configuration a module requires?

What kind of INI configuration the module requires is defined in each module. For example, for the Core module, there are the following configuration item definitions:

The code is as follows:

PHP_INI_BEGIN()
......
STD_PHP_INI_ENTRY_EX("display_errors", "1", PHP_INI_ALL, OnUpdateDisplayErrors, display_errors, php_core_globals, core_globals, display_errors_mode)
STD_PHP_INI_BOOLEAN("enable_dl", "1", PHP_INI_SYSTEM, OnUpdateBool, enable_dl, php_core_globals, core_globals)
STD_PHP_INI_BOOLEAN("expose_php", "1", PHP_INI_SYSTEM, OnUpdateBool, expose_php, php_core_globals, core_globals)
STD_PHP_INI_BOOLEAN("safe_mode", "0", PHP_INI_SYSTEM, OnUpdateBool, safe_mode, php_core_globals, core_globals)
......
PHP_INI_END()

The above code can be found in the php-src\main\main.c file at about line 450. There are many macros involved, including ZEND_INI_BEGIN, ZEND_INI_END, PHP_INI_ENTRY_EX, STD_PHP_INI_BOOLEAN, etc. This article will not go into details one by one. Interested readers can analyze them by themselves.

After macro expansion of the above code, we get:

The code is as follows:

static const zend_ini_entry ini_entries[] = {
    ..
    { 0, PHP_INI_ALL,    "display_errors",sizeof("display_errors"),OnUpdateDisplayErrors,(void *)XtOffsetOf(php_core_globals, display_errors), (void *)&core_globals, NULL, "1", sizeof("1")-1, NULL, 0, 0, 0, display_errors_mode },
    { 0, PHP_INI_SYSTEM, "enable_dl",     sizeof("enable_dl"),     OnUpdateBool,         (void *)XtOffsetOf(php_core_globals, enable_dl),      (void *)&core_globals, NULL, "1", sizeof("1")-1, NULL, 0, 0, 0, zend_ini_boolean_displayer_cb },
    { 0, PHP_INI_SYSTEM, "expose_php",    sizeof("expose_php"),    OnUpdateBool,         (void *)XtOffsetOf(php_core_globals, expose_php),     (void *)&core_globals, NULL, "1", sizeof("1")-1, NULL, 0, 0, 0, zend_ini_boolean_displayer_cb },
    { 0, PHP_INI_SYSTEM, "safe_mode",     sizeof("safe_mode"),     OnUpdateBool,         (void *)XtOffsetOf(php_core_globals, safe_mode),      (void *)&core_globals, NULL, "0", sizeof("0")-1, NULL, 0, 0, 0, zend_ini_boolean_displayer_cb },
    ...
    { 0, 0, NULL, 0, NULL, NULL, NULL, NULL, NULL, 0, NULL, 0, 0, 0, NULL }
};

As we can see, defining configuration items essentially means defining an array of type zend_ini_entry. The fields of the zend_ini_entry structure have the following meanings:

The code is as follows:

struct _zend_ini_entry {
int module_number; // module id
int modifiable; // Where the item may be modified, such as php.ini, ini_set
char *name; // The name of the configuration item
uint name_length;
ZEND_INI_MH((*on_modify)); // Callback function, will be called when the configuration item is registered or modified
void *mh_arg1; // Usually the offset of the configuration item field in XXX_G
void *mh_arg2; // Usually XXX_G
void *mh_arg3; // Usually a reserved field, rarely used

char *value; // The value of the configuration item
uint value_length;

char *orig_value; // The original value of the configuration item
uint orig_value_length;
int orig_modifiable; // The original modifiable of the configuration item
int modified; //Whether it has been modified, if so, orig_value will save the value before modification

void (*displayer)(zend_ini_entry *ini_entry, int type);
};
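The value, orig_value and modified fields hint at how a runtime change can be undone once a script finishes. The sketch below is a simplified model of that bookkeeping, not the php-src implementation: the struct sketch_entry and the functions sketch_alter and sketch_restore are hypothetical names, and the assumption is that the original value is saved on the first change only and put back at the end of the request.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced version of an ini entry: only the fields needed to
 * illustrate change tracking. */
typedef struct sketch_entry {
    const char *name;
    const char *value;       /* current value */
    const char *orig_value;  /* value before the first runtime change */
    int modified;            /* 1 once a runtime change has happened */
} sketch_entry;

/* Models a runtime change such as an ini_set call. */
void sketch_alter(sketch_entry *e, const char *new_value)
{
    if (!e->modified) {
        e->orig_value = e->value;   /* remember the startup value only once */
        e->modified = 1;
    }
    e->value = new_value;
}

/* Models the end of a request: undo any runtime change. */
void sketch_restore(sketch_entry *e)
{
    if (e->modified) {
        e->value = e->orig_value;
        e->orig_value = NULL;
        e->modified = 0;
    }
}

/* Usage example: call this from your own main(). */
void sketch_alter_restore_demo(void)
{
    sketch_entry e = { "display_errors", "1", NULL, 0 };
    sketch_alter(&e, "0");
    assert(strcmp(e.value, "0") == 0);
    sketch_restore(&e);
    assert(strcmp(e.value, "1") == 0);
}
```

Under this model, a value set at runtime is visible only until the restore step runs, which matches the observation that ini_set settings do not outlive the script.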

2.3. Applying configuration to a module - REGISTER_INI_ENTRIES

REGISTER_INI_ENTRIES appears in the PHP_MINIT_FUNCTION of many extensions. It is mainly responsible for two things. First, it fills the module's global space XXX_G by synchronizing the values in configuration_hash to XXX_G. Second, it populates EG(ini_directives).

REGISTER_INI_ENTRIES is also a macro; after expansion it is a call to the zend_register_ini_entries function. Let's look at the implementation of zend_register_ini_entries:

The code is as follows:

ZEND_API int zend_register_ini_entries(const zend_ini_entry *ini_entry, int module_number TSRMLS_DC) /* {{{ */
{
    // ini_entry is an array of zend_ini_entry; p points to the current item in the array
    const zend_ini_entry *p = ini_entry;
    zend_ini_entry *hashed_ini_entry;
    zval default_value;

    // EG(ini_directives) is registered_zend_ini_directives
    HashTable *directives = registered_zend_ini_directives;
    zend_bool config_directive_success = 0;

    // Remember that the last item of ini_entry is always {0, 0, NULL, ...}
    while (p->name) {
        config_directive_success = 0;

        // Add the zend_ini_entry pointed to by p to EG(ini_directives)
        if (zend_hash_add(directives, p->name, p->name_length, (void*)p, sizeof(zend_ini_entry), (void **) &hashed_ini_entry) == FAILURE) {
            zend_unregister_ini_entries(module_number TSRMLS_CC);
            return FAILURE;
        }
        hashed_ini_entry->module_number = module_number;

        // Query configuration_hash by name and put the result in default_value
        // Note that default_value is fairly raw, usually a number, string, array, etc., depending on how it is written in php.ini
        if ((zend_get_configuration_directive(p->name, p->name_length, &default_value)) == SUCCESS) {
            // Call on_modify to update the module's global space XXX_G
            if (!hashed_ini_entry->on_modify || hashed_ini_entry->on_modify(hashed_ini_entry, Z_STRVAL(default_value), Z_STRLEN(default_value), hashed_ini_entry->mh_arg1, hashed_ini_entry->mh_arg2, hashed_ini_entry->mh_arg3, ZEND_INI_STAGE_STARTUP TSRMLS_CC) == SUCCESS) {
                hashed_ini_entry->value = Z_STRVAL(default_value);
                hashed_ini_entry->value_length = Z_STRLEN(default_value);
                config_directive_success = 1;
            }
        }

        // If not found in configuration_hash, the default value is used
        if (!config_directive_success && hashed_ini_entry->on_modify) {
            hashed_ini_entry->on_modify(hashed_ini_entry, hashed_ini_entry->value, hashed_ini_entry->value_length, hashed_ini_entry->mh_arg1, hashed_ini_entry->mh_arg2, hashed_ini_entry->mh_arg3, ZEND_INI_STAGE_STARTUP TSRMLS_CC);
        }
        p++;
    }
    return SUCCESS;
}

To put it simply, the logic of the above code can be expressed as:

1. Add the ini configuration items declared by the module to EG(ini_directives). Note that the value of an ini configuration item may be modified later.

2. Try to find the ini settings required by each module in configuration_hash.

If a setting is found, the user configured this value in the ini file, and the user's configuration is used.
If it is not found, that is fine, because the module supplies a default value when declaring the ini entry.

3. Synchronize the ini value to XXX_G. After all, it is these XXX_globals that are used while PHP executes. Concretely, this means calling the on_modify function of each ini entry, which the module specifies when declaring the entry.
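The three steps above can be modeled in a few lines of stand-alone C. This is a deliberately reduced sketch under stated assumptions: the user configuration is a plain NULL-terminated array instead of a hash table, the entries array itself stands in for EG(ini_directives), and all names (sketch_kv, sketch_ini_entry, sketch_update_long, sketch_register_entries) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for configuration_hash: NULL-terminated list of user settings. */
struct sketch_kv { const char *key; const char *value; };

typedef struct sketch_ini_entry sketch_ini_entry;
typedef int (*sketch_on_modify)(sketch_ini_entry *e, const char *new_value);

/* Reduced ini entry: the table of these stands in for EG(ini_directives). */
struct sketch_ini_entry {
    const char *name;
    sketch_on_modify on_modify;  /* converts the string and stores it */
    void *target;                /* field in the module's global space */
    const char *value;           /* default at declaration, user value later */
};

/* Handler in the spirit of OnUpdateLong. Returns 0 on success. */
int sketch_update_long(sketch_ini_entry *e, const char *new_value)
{
    *(long *)e->target = strtol(new_value, NULL, 10);
    return 0;
}

static const char *sketch_lookup(const struct sketch_kv *conf, const char *name)
{
    for (; conf->key != NULL; conf++) {
        if (strcmp(conf->key, name) == 0) return conf->value;
    }
    return NULL;
}

/* Walk the entry table (terminated by a NULL name), prefer the user's value,
 * fall back to the declared default. Returns the number of entries seen. */
int sketch_register_entries(sketch_ini_entry *entries, const struct sketch_kv *conf)
{
    int n = 0;
    sketch_ini_entry *p;

    for (p = entries; p->name != NULL; p++, n++) {
        const char *user = sketch_lookup(conf, p->name);
        int applied = 0;

        if (user != NULL && (p->on_modify == NULL || p->on_modify(p, user) == 0)) {
            p->value = user;              /* the entry now holds the user value */
            applied = 1;
        }
        if (!applied && p->on_modify != NULL) {
            p->on_modify(p, p->value);    /* apply the declared default */
        }
    }
    return n;
}

/* Usage example: call this from your own main(). */
void sketch_register_entries_demo(void)
{
    long scale = -1;
    sketch_ini_entry entries[] = {
        { "bcmath.scale", sketch_update_long, &scale, "0" },
        { NULL, NULL, NULL, NULL }
    };
    struct sketch_kv conf[] = { { "bcmath.scale", "4" }, { NULL, NULL } };

    assert(sketch_register_entries(entries, conf) == 1);
    assert(scale == 4);
}
```

Settings that no module declares, such as hello=world, remain in the user configuration but are never copied anywhere, because no entry asks for them.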

Let’s take a closer look at on_modify, which is actually a function pointer. Let’s look at the configuration statements of two specific Core modules:

The code is as follows:

STD_PHP_INI_BOOLEAN("log_errors", "0", PHP_INI_ALL, OnUpdateBool, log_errors, php_core_globals, core_globals)
STD_PHP_INI_ENTRY("log_errors_max_len","1024", PHP_INI_ALL, OnUpdateLong, log_errors_max_len, php_core_globals, core_globals)

For log_errors, its on_modify is set to OnUpdateBool, and for log_errors_max_len, its on_modify is set to OnUpdateLong.

Further assume that our configuration in php.ini is:

The code is as follows:

log_errors = On
log_errors_max_len = 1024

Let’s take a closer look at the OnUpdateBool function:

The code is as follows:

ZEND_API ZEND_INI_MH(OnUpdateBool)
{
    zend_bool *p;

    // base is the address of core_globals
    char *base = (char *) mh_arg2;

    // p is the address of core_globals plus the offset of the log_errors field,
    // that is, the address of the log_errors field itself
    p = (zend_bool *) (base + (size_t) mh_arg1);

    if (new_value_length == 2 && strcasecmp("on", new_value) == 0) {
        *p = (zend_bool) 1;
    }
    else if (new_value_length == 3 && strcasecmp("yes", new_value) == 0) {
        *p = (zend_bool) 1;
    }
    else if (new_value_length == 4 && strcasecmp("true", new_value) == 0) {
        *p = (zend_bool) 1;
    }
    else {
        // The value stored in configuration_hash is the string "1", not "On",
        // so atoi is used here to convert it into the number 1
        *p = (zend_bool) atoi(new_value);
    }
    return SUCCESS;
}

The most puzzling parts are probably mh_arg1 and mh_arg2. Compared against the zend_ini_entry definition above, they are easy to understand: mh_arg1 is the byte offset and mh_arg2 is the address of XXX_globals. Therefore (char *)mh_arg2 + mh_arg1 is the address of a field in XXX_globals. In this case, it computes the address of log_errors in core_globals. So when OnUpdateBool finally executes:

The code is as follows:

*p = (zend_bool) atoi(new_value);

Its function is equivalent to

The code is as follows:

core_globals.log_errors = (zend_bool) atoi("1");
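The base-plus-offset arithmetic is ordinary C and can be checked in isolation. The sketch below assumes a hypothetical struct sketch_core_globals and two hypothetical helpers; offsetof from stddef.h supplies the byte offset that the ini entry carries in mh_arg1 in the real code, and strtol is used here in place of zend_atol.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef unsigned char sketch_bool;

/* Hypothetical stand-in for a module's globals structure. */
typedef struct sketch_core_globals {
    sketch_bool display_errors;
    sketch_bool log_errors;
    long log_errors_max_len;
} sketch_core_globals;

/* Store a boolean at (base address + byte offset). */
void sketch_store_bool(void *base, size_t offset, const char *new_value)
{
    sketch_bool *p = (sketch_bool *)((char *)base + offset);
    *p = (sketch_bool) atoi(new_value);
}

/* Store a long at (base address + byte offset). */
void sketch_store_long(void *base, size_t offset, const char *new_value)
{
    long *p = (long *)((char *)base + offset);
    *p = strtol(new_value, NULL, 10);
}

/* Usage example: call this from your own main(). */
void sketch_store_demo(void)
{
    sketch_core_globals g = { 0, 0, 0 };

    /* Same effect as g.log_errors = (sketch_bool) atoi("1"); */
    sketch_store_bool(&g, offsetof(sketch_core_globals, log_errors), "1");
    assert(g.log_errors == 1);
    assert(g.display_errors == 0);
}
```

Because the handler receives only a base pointer and an offset, the same function can update any field of any module's globals, which is why a handful of generic handlers can serve every module.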

After analyzing OnUpdateBool, let’s look at OnUpdateLong and it will be clear at a glance:

The code is as follows:

ZEND_API ZEND_INI_MH(OnUpdateLong)
{
    long *p;
    char *base = (char *) mh_arg2;

    // Get the address of log_errors_max_len
    p = (long *) (base + (size_t) mh_arg1);

    // Convert "1024" into a long and assign it to core_globals.log_errors_max_len
    *p = zend_atol(new_value, new_value_length);
    return SUCCESS;
}

Finally, note that in zend_register_ini_entries, if a setting exists in configuration_hash, the value and value_length in hashed_ini_entry are updated once on_modify succeeds. In other words, if the user configured the setting in php.ini, EG(ini_directives) stores the configured value; otherwise, EG(ini_directives) stores the default value given when declaring the zend_ini_entry.

The default_value variable in zend_register_ini_entries is poorly named and can easily cause misunderstanding. In fact, default_value does not represent the default value, but the value actually configured by the user.

3. Summary

At this point, the three pieces of data, configuration_hash, EG(ini_directives), and PG, BG, PCRE_G, JSON_G, XXX_G..., have all been explained.

To summarize:

1. configuration_hash stores the configuration from the php.ini file. No validation is performed, and its values are strings.
2. EG(ini_directives) stores the zend_ini_entry items defined by each module. If the user configured an item in php.ini (so it exists in configuration_hash), its value is replaced by the value from configuration_hash; the type is still a string.
3. XXX_G is the macro used to access a module's global space. This memory can store ini configuration and is updated through the function specified by on_modify. Its data types are determined by the field declarations in the XXX_globals structure.
