In-depth analysis of foreach issues in PHP

WBOY
2016-07-21 15:01:34

Foreword:
The foreach construct was introduced in PHP 4 as a simple way to traverse an array. Compared with a traditional for loop, foreach makes it more convenient to obtain key-value pairs. Before PHP 5, foreach could only be used on arrays; since PHP 5 it can also traverse objects (see: Traversing Objects for details). This article only discusses array traversal.

Although foreach is simple, it can behave in unexpected ways, especially when the code involves references.
The cases below help us understand the nature of foreach more deeply.
Question 1:


    $arr = array(1, 2, 3);
    foreach ($arr as $k => &$v) {
        $v = $v * 2;
    }
    // now $arr is array(2, 4, 6)
    foreach ($arr as $k => $v) {
        echo "$k=>$v ";
    }

Let's start simple. If we run the code above, the final output is 0=>2 1=>4 2=>4.
Why not 0=>2 1=>4 2=>6?
We can think of the foreach($arr as $k => $v) structure as implying the following operations: on each iteration it assigns the array's current 'value' and current 'key' to the variables $v and $k. Expanded, it looks like this (currentVal() and currentKey() are pseudocode, not real PHP functions):

    foreach ($arr as $k => $v) {
        // Two assignments happen implicitly before the user code runs
        $v = currentVal();
        $k = currentKey();
        // The user code then runs
        ...
    }
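
The implicit assignments can be made explicit by driving the array's internal pointer by hand. The sketch below is only a mental model, not how the engine implements foreach. manualForeach() is a hypothetical helper written for this illustration; it relies only on the standard reset(), key(), current() and next() functions.

```php
<?php
// Hypothetical helper: walks an array with explicit "current value" and
// "current key" assignments on every step.
function manualForeach(array $arr)
{
    $lines = array();
    // reset() rewinds the internal pointer; key() returns null past the end.
    for (reset($arr); key($arr) !== null; next($arr)) {
        $v = current($arr);  // the implicit "$v = currentVal()"
        $k = key($arr);      // the implicit "$k = currentKey()"
        $lines[] = "$k=>$v"; // the user code runs after both assignments
    }
    return $lines;
}

echo implode(' ', manualForeach(array(2, 4, 6))), "\n";
```

The point to take away is the order: the assignment to $v happens first, on every iteration, whatever $v happened to be before the loop.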

Based on this model, let's re-analyze the first foreach:
In the first iteration, $v is a reference, so $v = &$arr[0]; $v = $v * 2 is equivalent to $arr[0] = $arr[0] * 2, and $arr becomes 2,2,3.
In the second iteration, $v = &$arr[1], and $arr becomes 2,4,3.
In the third iteration, $v = &$arr[2], and $arr becomes 2,4,6.
Then the code enters the second foreach:
In the first iteration, the implicit assignment $v = $arr[0] is triggered. Because $v is still a reference to $arr[2] at this point, this is equivalent to $arr[2] = $arr[0], and $arr becomes 2,4,2.
In the second iteration, $v = $arr[1], that is, $arr[2] = $arr[1], and $arr becomes 2,4,4.
In the third iteration, $v = $arr[2], that is, $arr[2] = $arr[2], and $arr stays 2,4,4.
OK, analysis complete.
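
The step-by-step states can be checked directly. This is a minimal sketch: traceSecondLoop() is a hypothetical helper that replays Question 1 and records the contents of $arr after each iteration of the second loop.

```php
<?php
// Hypothetical helper: replays Question 1 and snapshots $arr inside the
// second (by-value) loop.
function traceSecondLoop(array $arr)
{
    foreach ($arr as $k => &$v) {
        $v = $v * 2;
    }
    // $v is deliberately NOT unset, so it still aliases the last element.
    $snapshots = array();
    foreach ($arr as $k => $v) {
        // The implicit assignment to $v has already written into the last slot.
        $snapshots[] = implode(',', $arr);
    }
    return $snapshots;
}

print_r(traceSecondLoop(array(1, 2, 3)));
// The three snapshots are "2,4,2", "2,4,4", "2,4,4"
```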
How do we avoid this kind of problem? The PHP manual carries a warning:
Warning: The reference held by $value to the last element of the array is still retained after the foreach loop. It is recommended to destroy it with unset().

    $arr = array(1, 2, 3);
    foreach ($arr as $k => &$v) {
        $v = $v * 2;
    }
    unset($v);
    foreach ($arr as $k => $v) {
        echo "$k=>$v ";
    }
    // Output: 0=>2 1=>4 2=>6

From this question we can see that references are likely to bring side effects. If you don't want unintended modifications to change the contents of the array, it is best to unset such references promptly.
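
Another way to sidestep the problem is to not create the reference at all and write back through the key. A minimal sketch, where doubleAll() is a hypothetical helper name chosen for this illustration:

```php
<?php
// Hypothetical helper: doubles every element without a by-reference loop,
// so no reference survives the loop.
function doubleAll(array $arr)
{
    foreach ($arr as $k => $v) {
        $arr[$k] = $v * 2; // write through the key instead of through &$v
    }
    return $arr;
}

$arr = doubleAll(array(1, 2, 3));
foreach ($arr as $k => $v) {
    echo "$k=>$v ";
}
// Prints 0=>2 1=>4 2=>6
```

Because $v is an ordinary variable here, reusing the name in a later loop has no effect on the array.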
Question 2:

    $arr = array('a', 'b', 'c');
    foreach ($arr as $k => $v) {
        echo key($arr), "=>", current($arr), " ";
    }
    // Prints 1=>b 1=>b 1=>b

This question is even stranger. According to the manual, key and current return the key and the value of the current element in the array.
Then why is key($arr) always 1 and current($arr) always b (on the PHP 5 engine analyzed here)?
First, use vld to inspect the compiled opcodes:

Starting from line 3, the ASSIGN instruction assigns array('a','b','c') to $arr.
Since $arr is a CV and array('a','b','c') is a TMP, the function that actually executes the ASSIGN instruction is ZEND_ASSIGN_SPEC_CV_TMP_HANDLER. Note that CV is a variable cache added in PHP 5.1. It uses an array to store zval** pointers. When a cached variable is used again, there is no need to search the active symbol table; the pointer is fetched directly from the CV array. Since array access is much faster than a hash table lookup, this improves efficiency.

    static int ZEND_FASTCALL ZEND_ASSIGN_SPEC_CV_TMP_HANDLER(ZEND_OPCODE_HANDLER_ARGS)
    {
        zend_op *opline = EX(opline);
        zend_free_op free_op2;
        zval *value = _get_zval_ptr_tmp(&opline->op2, EX(Ts), &free_op2 TSRMLS_CC);

        // Create the $arr** pointer in the CV array
        zval **variable_ptr_ptr = _get_zval_ptr_ptr_cv(&opline->op1, EX(Ts), BP_VAR_W TSRMLS_CC);
        if (IS_CV == IS_VAR && !variable_ptr_ptr) {
            ……
        }
        else {
            value = zend_assign_to_variable(variable_ptr_ptr, value, 1 TSRMLS_CC);
            if (!RETURN_VALUE_UNUSED(&opline->result)) {
                AI_SET_PTR(EX_T(opline->result.u.var).var, value);
                PZVAL_LOCK(value);
            }
        }
        ZEND_VM_NEXT_OPCODE();
    }

After the ASSIGN instruction completes, a zval** pointer has been added to the CV array, and it points to the actual array. In other words, $arr is now cached in CV.

Next comes the loop over the array. Let's look at the FE_RESET instruction; its handler is ZEND_FE_RESET_SPEC_CV_HANDLER:

    static int ZEND_FASTCALL ZEND_FE_RESET_SPEC_CV_HANDLER(ZEND_OPCODE_HANDLER_ARGS)
    {
        ……
        if (……) {
            ……
        }
        // Save the pointer to the array into zend_execute_data->Ts
        // (Ts stores the temp_variables used during code execution)
        AI_SET_PTR(EX_T(opline->result.u.var).var, array_ptr);
        PZVAL_LOCK(array_ptr);
        if (iter) {
            ……
        } else if ((fe_ht = HASH_OF(array_ptr)) != NULL) {
            // Reset the array's internal pointer
            zend_hash_internal_pointer_reset(fe_ht);
            if (ce) {
                ……
            }
            is_empty = zend_hash_has_more_elements(fe_ht) != SUCCESS;

            // Save the array's internal pointer into EX_T(opline->result.u.var).fe.fe_pos
            zend_hash_get_pointer(fe_ht, &EX_T(opline->result.u.var).fe.fe_pos);
        } else {
            ……
        }
        ……
    }


Here, two important pointers are stored in zend_execute_data->Ts:

•EX_T(opline->result.u.var).var ---- pointer to the array
•EX_T(opline->result.u.var).fe.fe_pos ---- pointer to the array's current internal element

After the FE_RESET instruction is executed, the actual situation in memory is as follows:


Next we look at FE_FETCH; its handler is ZEND_FE_FETCH_SPEC_VAR_HANDLER:

    static int ZEND_FASTCALL ZEND_FE_FETCH_SPEC_VAR_HANDLER(ZEND_OPCODE_HANDLER_ARGS)
    {
        zend_op *opline = EX(opline);

        // Note: the array pointer is read from EX_T(opline->op1.u.var).var.ptr
        zval *array = EX_T(opline->op1.u.var).var.ptr;
        ……

        switch (zend_iterator_unwrap(array, &iter TSRMLS_CC)) {
            ……
            case ZEND_ITER_PLAIN_ARRAY:
                fe_ht = HASH_OF(array);

                // The position pointer is fetched here, from EX_T(opline->op1.u.var).fe.fe_pos
                zend_hash_set_pointer(fe_ht, &EX_T(opline->op1.u.var).fe.fe_pos);
                if (zend_hash_get_current_data(fe_ht, (void **) &value)==FAILURE) {
                    ……
                }
                if (use_key) {
                    key_type = zend_hash_get_current_key_ex(fe_ht, &str_key, &str_key_len, &int_key, 1, NULL);
                }
                zend_hash_move_forward(fe_ht);
                // The pointer after moving is saved back to EX_T(opline->op1.u.var).fe.fe_pos
                zend_hash_get_pointer(fe_ht, &EX_T(opline->op1.u.var).fe.fe_pos);
                ……
        }
        ……
    }

From the implementation of FE_FETCH, we can roughly see what foreach($arr as $k => $v) does: it fetches the array element based on the pointers in zend_execute_data->Ts; once the fetch succeeds, it moves the pointer to the next position and saves it again.




To put it simply, because FE_FETCH has already moved the array's internal pointer to the second element during the first iteration, calling key($arr) and current($arr) inside foreach actually returns 1 and 'b'.

Then why is 1=>b output three times?

Let's continue with the SEND_REF instructions on lines 9 and 13, which push the $arr argument onto the stack. The DO_FCALL instruction is then used to call the key and current functions. PHP is not compiled into native machine code, so it uses opcode instructions like these to simulate how a real CPU and memory work.

Look at SEND_REF in the PHP source code:

    static int ZEND_FASTCALL ZEND_SEND_REF_SPEC_CV_HANDLER(ZEND_OPCODE_HANDLER_ARGS)
    {

        // Get the pointer to the $arr pointer from CV
        varptr_ptr = _get_zval_ptr_ptr_cv(&opline->op1, EX(Ts), BP_VAR_W TSRMLS_CC);

        // Variable separation: a new copy of the array is made here
        // specifically for the key function
        SEPARATE_ZVAL_TO_MAKE_IS_REF(varptr_ptr);
        varptr = *varptr_ptr;
        Z_ADDREF_P(varptr);

        // Push onto the stack
        zend_vm_stack_push(varptr TSRMLS_CC);
        ZEND_VM_NEXT_OPCODE();
    }

SEPARATE_ZVAL_TO_MAKE_IS_REF in the code above is a macro:

    #define SEPARATE_ZVAL_TO_MAKE_IS_REF(ppzv) \
        if (!PZVAL_IS_REF(*ppzv)) { \
            SEPARATE_ZVAL(ppzv); \
            Z_SET_ISREF_PP((ppzv)); \
        }


The main job of SEPARATE_ZVAL_TO_MAKE_IS_REF is this: if the variable is not a reference, a new copy of it is made in memory. In this example, it copies array('a','b','c'). Therefore, the memory after variable separation is:
Note that after the variable separation completes, the pointer in the CV array points to the newly copied data, while the old data can still be reached through the pointer in zend_execute_data->Ts.
The following iterations will not be described one by one. Combined with the figure above:

•The foreach structure uses the blue array below (the old data, reached through zend_execute_data->Ts), and traverses a, b, c in sequence
•key and current use the yellow array above (the new copy, reached through CV), whose internal pointer always points to b

At this point we understand why key and current always return the second element of the array: since no outside code acts on the copied array, its internal pointer never moves.
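
Because this result comes from engine internals, it is worth checking on the PHP version you actually run. A minimal sketch: keysSeenInsideForeach() is a hypothetical helper that records what key() reports on each iteration. Since PHP 7, foreach no longer moves the array's internal pointer, so there the helper collects 0 three times; on PHP 5 the values follow the mechanism described above and can also depend on the refcount of the array.

```php
<?php
// Hypothetical helper: collects what key() reports on each iteration of a
// by-value foreach over the same array.
function keysSeenInsideForeach(array $arr)
{
    $seen = array();
    foreach ($arr as $k => $v) {
        $seen[] = key($arr); // the internal pointer, not the loop position
    }
    return $seen;
}

var_dump(keysSeenInsideForeach(array('a', 'b', 'c')));
```

Whatever the version, the reported key does not follow the loop variable $k, which is the practical lesson: do not mix foreach with key() and current() on the same array.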

Question 3:


    $arr = array('a', 'b', 'c');
    foreach ($arr as $k => &$v) {
        echo key($arr), '=>', current($arr), ' ';
    }
    // Prints 1=>b 2=>c =>



There is only one difference between this question and Question 2: the foreach here uses a reference.
Checking with VLD shows that the compiled opcodes are the same as those for Question 2, so we use the same tracing approach and step through the handler of each opcode.
First, foreach calls FE_RESET:

    static int ZEND_FASTCALL ZEND_FE_RESET_SPEC_CV_HANDLER(ZEND_OPCODE_HANDLER_ARGS)
    {
        ……
        if (opline->extended_value & ZEND_FE_RESET_VARIABLE) {
            // Get the variable from CV
            array_ptr_ptr = _get_zval_ptr_ptr_cv(&opline->op1, EX(Ts), BP_VAR_R TSRMLS_CC);
            if (array_ptr_ptr == NULL || array_ptr_ptr == &EG(uninitialized_zval_ptr)) {
                ……
            }
            else if (Z_TYPE_PP(array_ptr_ptr) == IS_OBJECT) {
                ……
            }
            else {
                if (Z_TYPE_PP(array_ptr_ptr) == IS_ARRAY) {
                    SEPARATE_ZVAL_IF_NOT_REF(array_ptr_ptr);
                    if (opline->extended_value & ZEND_FE_FETCH_BYREF) {
                        // Mark the zval that holds the array as is_ref
                        Z_SET_ISREF_PP(array_ptr_ptr);
                    }
                }
                array_ptr = *array_ptr_ptr;
                Z_ADDREF_P(array_ptr);
            }
        } else {
            ……
        }
        ……
    }



Part of the implementation of FE_RESET was already analyzed in Question 2.
One point needs special attention here: in this example foreach fetches values by reference, so during execution FE_RESET enters a different branch from the one in the previous question.
In the end, FE_RESET sets is_ref of the array to true, and at this point there is still only one copy of the array data in memory.

Next, analyze SEND_REF:

    static int ZEND_FASTCALL ZEND_SEND_REF_SPEC_CV_HANDLER(ZEND_OPCODE_HANDLER_ARGS)
    {
        // Get the pointer to the $arr pointer from CV
        varptr_ptr = _get_zval_ptr_ptr_cv(&opline->op1, EX(Ts), BP_VAR_W TSRMLS_CC);
        ……
        // Variable separation: since the variable in CV is already a reference
        // at this point, no new array is copied here
        SEPARATE_ZVAL_TO_MAKE_IS_REF(varptr_ptr);
        varptr = *varptr_ptr;
        Z_ADDREF_P(varptr);

        // Push onto the stack
        zend_vm_stack_push(varptr TSRMLS_CC);
        ZEND_VM_NEXT_OPCODE();
    }


The macro SEPARATE_ZVAL_TO_MAKE_IS_REF only separates variables with is_ref=false. Since the array was already set to is_ref=true earlier, it is not copied. In other words, there is still only one copy of the array data in memory at this point.


The figure above explains why the first two iterations output 1=>b 2=>c. During the third iteration, FE_FETCH continues to move the pointer forward:

    ZEND_API int zend_hash_move_forward_ex(HashTable *ht, HashPosition *pos)
    {
        HashPosition *current = pos ? pos : &ht->pInternalPointer;

        IS_CONSISTENT(ht);

        if (*current) {
            *current = (*current)->pListNext;
            return SUCCESS;
        } else
            return FAILURE;
    }
Since the internal pointer already points to the last element of the array at this time, moving it forward makes it NULL. With the internal pointer at NULL, calling key and current on the array returns NULL and false respectively, indicating that the calls failed. Neither value contributes any characters when echoed, so only the literal => is printed.
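
The return values past the end of the array can be confirmed without foreach at all. A minimal sketch, where probePastEnd() is a hypothetical helper that uses the standard end() and next() functions to push the internal pointer off the end:

```php
<?php
// Hypothetical helper: moves the internal pointer past the last element
// and reports what key() and current() return there.
function probePastEnd(array $arr)
{
    end($arr);  // internal pointer on the last element
    next($arr); // one step further: the pointer is now past the end
    return array(key($arr), current($arr));
}

var_dump(probePastEnd(array('a', 'b', 'c')));
// key() gives NULL and current() gives false; echoing either prints nothing
```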
Question 4:

    $arr = array(1, 2, 3);
    $tmp = $arr;
    foreach ($tmp as $k => &$v) {
        $v *= 2;
    }
    var_dump($arr, $tmp); // What does this print?

This question has little to do with foreach itself, but since foreach is involved, let's discuss it here :)
The code first creates the array $arr and then assigns it to $tmp. In the following foreach loop, modifying $v affects the array $tmp but not $arr.
Why?
Because in PHP, assignment copies the value of one variable into another, so modifying one of them does not affect the other.
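
To see both arrays side by side, the question's code can be wrapped in a function that returns them. This is a minimal sketch; doubleCopyByReference() is a hypothetical helper name.

```php
<?php
// Hypothetical helper wrapping the code from Question 4.
function doubleCopyByReference(array $arr)
{
    $tmp = $arr;                  // assignment copies the value
    foreach ($tmp as $k => &$v) {
        $v *= 2;                  // writes land in $tmp only
    }
    unset($v);                    // drop the leftover reference
    return array($arr, $tmp);
}

list($arr, $tmp) = doubleCopyByReference(array(1, 2, 3));
var_dump($arr); // still 1, 2, 3
var_dump($tmp); // 2, 4, 6
```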
Digression: this does not apply to objects. Starting from PHP 5, assigning an object copies only its handle, so both variables refer to the same object (it behaves like assignment by reference). For example:

    class A {
        public $foo = 1;
    }
    $a1 = $a2 = new A;
    $a1->foo = 100;
    echo $a2->foo; // Output 100: $a1 and $a2 refer to the same object
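
When an independent object is wanted, the clone keyword makes the copy explicit. A minimal sketch for contrast; the Counter class and the assignVersusClone() helper are hypothetical names used only here.

```php
<?php
// Hypothetical class used only for this illustration.
class Counter
{
    public $foo = 1;
}

// Hypothetical helper: contrasts plain assignment with clone.
function assignVersusClone()
{
    $a1 = new Counter;
    $a2 = $a1;       // copies the handle: both variables see one object
    $a3 = clone $a1; // clone creates a separate object
    $a1->foo = 100;
    return array($a2->foo, $a3->foo);
}

echo implode(' ', assignVersusClone()), "\n"; // prints "100 1"
```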

Back to the code in the question. We can now confirm that $tmp = $arr is a value copy: the entire $arr array is copied to $tmp. In theory, after the assignment executes there are two copies of the same array in memory.
Some readers may wonder: if the array is large, wouldn't this operation be very slow?
Fortunately, PHP handles this in a smarter way. In fact, after $tmp = $arr executes there is still only one array in memory. Look at the implementation of zend_assign_to_variable in the PHP source code (excerpted from PHP 5.3.26):

    static inline zval* zend_assign_to_variable(zval **variable_ptr_ptr, zval *value, int is_tmp_var TSRMLS_DC)
    {
        zval *variable_ptr = *variable_ptr_ptr;
        zval garbage;
        ……
        // The lvalue is an object
        if (Z_TYPE_P(variable_ptr) == IS_OBJECT && Z_OBJ_HANDLER_P(variable_ptr, set)) {
            ……
        }
        // The lvalue is a reference
        if (PZVAL_IS_REF(variable_ptr)) {
            ……
        } else {
            // The lvalue has refcount__gc == 1
            if (Z_DELREF_P(variable_ptr)==0) {
                ……
            } else {
                ……
                if (PZVAL_IS_REF(value) && Z_REFCOUNT_P(value) > 0) {
                    ALLOC_ZVAL(variable_ptr);
                    *variable_ptr_ptr = variable_ptr;
                    *variable_ptr = *value;
                    ……
                    Z_SET_REFCOUNT_P(variable_ptr, 1);
                } else {
                    // $tmp = $arr runs here:
                    // only the pointer is copied, not the actual array
                    *variable_ptr_ptr = value;
                    Z_ADDREF_P(value);
                }
                ……
                Z_UNSET_ISREF_PP(variable_ptr_ptr);
            }
        }
        return *variable_ptr_ptr;
    }


As we can see, $tmp = $arr essentially copies the array pointer and then increments the array's refcount by 1. Expressed as a diagram of memory at this point, there is still only one array:


Since there is only one array, why doesn't $arr change when $tmp is modified in the foreach loop?
Continue with the ZEND_FE_RESET_SPEC_CV_HANDLER function in the PHP source code. It is an opcode handler whose opcode is FE_RESET, and it is responsible for setting the array's internal pointer to the first element before foreach begins.




static int ZEND_FASTCALL  ZEND_FE_RESET_SPEC_CV_HANDLER(ZEND_OPCODE_HANDLER_ARGS)
{
    zend_op *opline = EX(opline);
    zval *array_ptr, **array_ptr_ptr;
    HashTable *fe_ht;
    zend_object_iterator *iter = NULL;
    zend_class_entry *ce = NULL;
    zend_bool is_empty = 0;
    // FE_RESET on a variable
    if (opline->extended_value & ZEND_FE_RESET_VARIABLE) {
        array_ptr_ptr = _get_zval_ptr_ptr_cv(&opline->op1, EX(Ts), BP_VAR_R TSRMLS_CC);
        if (array_ptr_ptr == NULL || array_ptr_ptr == &EG(uninitialized_zval_ptr)) {
            ……
        }
        // foreach over an object
        else if (Z_TYPE_PP(array_ptr_ptr) == IS_OBJECT) {
            ……
        }
        else {
            // This example takes this branch
            if (Z_TYPE_PP(array_ptr_ptr) == IS_ARRAY) {
                // Note the SEPARATE_ZVAL_IF_NOT_REF here:
                // it makes a fresh copy of the array,
                // truly separating $tmp and $arr into two arrays in memory
                SEPARATE_ZVAL_IF_NOT_REF(array_ptr_ptr);
                if (opline->extended_value & ZEND_FE_FETCH_BYREF) {
                    Z_SET_ISREF_PP(array_ptr_ptr);
                }
            }
            array_ptr = *array_ptr_ptr;
            Z_ADDREF_P(array_ptr);
        }
    } else {
        ……
    }

    // Reset the array's internal pointer
    ……
}

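As the code shows, variable separation does not actually happen when the assignment statement executes; it is deferred until the variable is used. This is how PHP implements the Copy On Write mechanism.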
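
Copy On Write can also be observed from userland by watching memory usage. The sketch below is an assumption-laden illustration, not a benchmark: measureCopyOnWrite() is a hypothetical helper, it triggers the separation with a direct write to the copy rather than with a by-reference foreach, and the exact byte counts reported by memory_get_usage() vary by PHP version and platform.

```php
<?php
// Hypothetical helper: reports how much memory_get_usage() grows after a
// plain assignment and after the first write to the copy.
function measureCopyOnWrite($size)
{
    $arr = range(1, $size);
    $before = memory_get_usage();

    $tmp = $arr;                 // assignment only: the data is shared
    $afterAssign = memory_get_usage();

    $tmp[0] = 0;                 // first write forces the real copy
    $afterWrite = memory_get_usage();

    return array(
        'assign' => $afterAssign - $before,
        'write'  => $afterWrite - $afterAssign,
    );
}

print_r(measureCopyOnWrite(100000));
```

The 'assign' figure should stay near zero while the 'write' figure is roughly the size of the whole array, which is the deferred copy taking place.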
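After FE_RESET, memory changes as follows:


The figure above explains why foreach does not affect the original $arr. As for how ref_count and is_ref change, interested readers can study the implementations of ZEND_FE_RESET_SPEC_CV_HANDLER and ZEND_SWITCH_FREE_SPEC_VAR_HANDLER (both in php-src/zend/zend_vm_execute.h); this article does not analyze them in detail :)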
