
Changes brought to PHP7 by the new Abstract Syntax Tree (AST)

Guanhui
2020-05-14 11:12:07

Most of this article is based on the AST RFC: https://wiki.php.net/rfc/abstract_syntax_tree. Excerpts from the RFC are included for ease of understanding.

This article does not explain what an abstract syntax tree is; readers are expected to look that up themselves. It only describes the changes that the AST brings to PHP.

New execution process

An important change in the core of PHP7 is the addition of an AST. In PHP5, a PHP script is turned into opcodes in two steps:

1. Lexing: lexical analysis, converting the source file into a token stream;

2. Parsing: syntax analysis; op arrays are generated at this stage.

In PHP7, op arrays are no longer generated directly during parsing. An AST is generated first, so the process has one more step:

1. Lexing: lexical analysis, converting the source file into a token stream;

2. Parsing: syntax analysis, generating an abstract syntax tree from the token stream;

3. Compilation: generating op arrays from the abstract syntax tree.
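
The lexing stage can be observed from userland. The sketch below uses token_get_all() from PHP's bundled tokenizer extension to list the tokens of a one-line script. The later stages (building the AST and generating op arrays) happen inside the engine and are not shown here. The sample source and variable names are illustrative only.

```php
<?php
// Sample source to tokenize; any valid PHP snippet would do.
$source = '<?php $a = 1;';

$names = [];
foreach (token_get_all($source) as $token) {
    // A token is either an [id, text, line] array or a single-character string.
    $names[] = is_array($token) ? token_name($token[0]) : $token;
}

echo implode(' ', $names), PHP_EOL;
// T_OPEN_TAG T_VARIABLE T_WHITESPACE = T_WHITESPACE T_LNUMBER ;
```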

Execution time and memory consumption

The new process has one more step than the old one, so one would expect both execution time and memory usage to increase. In fact, memory usage does increase, but execution time decreases.

The following results were obtained by testing three scripts: small (about 100 lines of code), medium (about 700 lines), and large (about 2800 lines). Test script: https://gist.github.com/nikic/289b0c7538b46c2220bc

Execution time for compiling each file 100 times (note that these results date from 2014, when PHP7 was still called PHP-NG):

          php-ng    php-ast   diff
SMALL     0.180s    0.160s    -12.5%
MEDIUM    1.492s    1.268s    -17.7%
LARGE     6.703s    5.736s    -16.9%

Peak memory usage in a single compilation:

          php-ng    php-ast   diff
SMALL     378kB     414kB     +9.5%
MEDIUM    507kB     643kB     +26.8%
LARGE     1084kB    1857kB    +71.3%

The results of compiling a single file may not represent actual usage. The following are the results of a complete project test using PhpParser:

          php-ng    php-ast   diff
TIME      25.5ms    22.8ms    -11.8%
MEMORY    2360kB    2482kB    +5.1%

The tests show that with the AST, overall execution time improves by about 10% to 15%, while memory consumption increases. The increase is noticeable when compiling a single large file, but over the execution of an entire project it is not a serious problem.
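
The diff figures can be reproduced from the raw measurements. Judging from the numbers, the time diffs appear to be expressed relative to the php-ast value, while the memory diffs are relative to the php-ng value. This convention is inferred from the data, not stated in the source, and the helper name below is hypothetical.

```php
<?php
// Hypothetical helper: signed percentage change from $old to $new,
// expressed relative to $base.
function formatDiff(float $old, float $new, float $base): string
{
    return sprintf('%+.1f%%', ($new - $old) / $base * 100);
}

// Compile time, SMALL file: relative to the php-ast value (assumed convention).
$timeSmall = formatDiff(0.180, 0.160, 0.160);

// Peak memory, SMALL file: relative to the php-ng value (assumed convention).
$memorySmall = formatDiff(378, 414, 378);

echo $timeSmall, ' ', $memorySmall, PHP_EOL; // -12.5% +9.5%
```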

Also note that the above results are all without Opcache. When Opcache is turned on in a production environment, the increase in memory consumption is not a big problem.

Semantic changes

If it were only a performance optimization, that would not seem to be a sufficient reason to adopt an AST. In fact, the AST was not introduced for performance reasons, but to solve syntax problems. Let's look at some of the semantic changes.

yield does not require parentheses

In the PHP5 implementation, if you use yield in an expression context (such as on the right-hand side of an assignment), you must wrap the yield expression in parentheses:

<?php
$result = yield fn();   // illegal
$result = (yield fn()); // legal

This behavior is only due to implementation limitations of PHP5. In PHP7, the parentheses are no longer necessary, so the following forms are also legal:

<?php
$result = yield;
$result = yield $v;
$result = yield $k => $v;

Of course, yield can still only be used where it is valid, that is, inside a function.
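
As an illustration, here is a minimal generator for PHP7 and later that uses yield on the right-hand side of an assignment without parentheses. The function and variable names are made up for this sketch.

```php
<?php
// Hypothetical generator: keeps a running total of the values sent to it.
function runningTotal()
{
    $total = 0;
    while (true) {
        // No parentheses are needed around the yield expression in PHP7.
        $received = yield $total;
        $total += $received;
    }
}

$gen = runningTotal();
$first  = $gen->current(); // runs to the first yield
$second = $gen->send(5);   // resumes with 5, returns the next yielded total
$third  = $gen->send(10);

echo "$first $second $third", PHP_EOL; // 0 5 15
```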

Parentheses do not affect behavior

In PHP5, ($foo)['bar'] = 'baz' and $foo['bar'] = 'baz' do not mean the same thing. In fact, the former is illegal, and you get the following error:

<?php
($foo)['bar'] = 'baz';
# PHP Parse error: Syntax error, unexpected '[' on line 1

But in PHP7, the two ways of writing mean the same thing.

Similarly, wrapping a function call result in extra parentheses used to affect the check performed when it is passed to a by-reference parameter. This has also been resolved in PHP7:

<?php
function func() {
    return [];
}

function byRef(array &$a) {
}

byRef((func()));

In PHP5 the code above raises no message, whereas byRef(func()) without the extra parentheses does. In PHP7 the following message is raised whether or not func() is wrapped in parentheses:

PHP Strict standards: Only variables should be passed by reference ...
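
To check this on a given PHP version, the message can be captured with an error handler. The following is a sketch for PHP7 and later, where released versions report this message as a notice; the handler and the $messages variable are illustrative additions, not part of the original example.

```php
<?php
// Collect diagnostics instead of printing them.
$messages = [];
set_error_handler(function (int $severity, string $message) use (&$messages) {
    $messages[] = $message;
    return true; // mark the error as handled
});

function func()
{
    return [];
}

function byRef(array &$a)
{
}

byRef(func());   // reported
byRef((func())); // also reported: the extra parentheses change nothing

restore_error_handler();

echo count($messages), PHP_EOL; // 2
```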

Changes in list()

The behavior of the list keyword has changed considerably. list used to assign values to its variables from right to left, but now it assigns from left to right (this refers to the order in which the assignments are carried out, on both sides of the equals sign together):

<?php
list($array[], $array[], $array[]) = [1, 2, 3];
var_dump($array);

// PHP5: $array = [3, 2, 1]
// PHP7: $array = [1, 2, 3]

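# Note that the order here refers to the order in which both sides of the equals sign are processed together.
# For list($a, $b) = [1, 2], there is no doubt that $a == 1 and $b == 2.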

The change arises because PHP5 performed the assignments starting from the last element, so 3 was appended to the array first and 1 last. PHP7 now assigns in the opposite order.
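
Code that must produce the same result on both versions can avoid depending on the assignment order. A minimal sketch, with illustrative variable names, comparing the list() form against an explicit loop on PHP7 and later:

```php
<?php
$values = [1, 2, 3];

// Depends on the list() assignment order:
// [1, 2, 3] on PHP7, [3, 2, 1] on PHP5.
$viaList = [];
list($viaList[], $viaList[], $viaList[]) = $values;

// Order-independent rewrite: append explicitly, one element at a time.
$explicit = [];
foreach ($values as $value) {
    $explicit[] = $value;
}

var_dump($viaList === $explicit); // bool(true) on PHP7 and later
```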

According to the RFC, the same change affects the following code:

<?php
$a = [1, 2];
list($a, $b) = $a;

// PHP5: $a = 1, $b = 2
// PHP7: $a = 1, $b = null + "Undefined index 1"

This is because the old assignment process gave $b the value 2 first and only then set $a to 1. Now $a is assigned first, becomes 1 and is no longer an array, so $b becomes null.

list now accesses each offset only once:

<?php
list(list($a, $b)) = $array;

// PHP5:
$b = $array[0][1];
$a = $array[0][0];

// PHP7:
// An intermediate variable is created to hold the value of $array[0]
$_tmp = $array[0];
$a = $_tmp[0];
$b = $_tmp[1];
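
One way to observe the single access is to destructure an object that counts its reads. The CountingArray class below is a hypothetical helper written for this sketch, and the mixed type declarations assume PHP 8.0 or later.

```php
<?php
// Hypothetical helper: wraps an array and counts offsetGet() calls.
final class CountingArray implements ArrayAccess
{
    public int $reads = 0;
    private array $data;

    public function __construct(array $data)
    {
        $this->data = $data;
    }

    public function offsetExists(mixed $offset): bool
    {
        return isset($this->data[$offset]);
    }

    public function offsetGet(mixed $offset): mixed
    {
        $this->reads++; // record every read of an offset
        return $this->data[$offset];
    }

    public function offsetSet(mixed $offset, mixed $value): void
    {
        $this->data[$offset] = $value;
    }

    public function offsetUnset(mixed $offset): void
    {
        unset($this->data[$offset]);
    }
}

$array = new CountingArray([[10, 20]]);
list(list($a, $b)) = $array;

// Offset 0 of the wrapper should be read once, not once per inner variable.
echo $array->reads, PHP_EOL;
```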

An empty list() is now always prohibited. Previously it was only prohibited in some places:

<?php
list() = $a;           // illegal
list($b, list()) = $a; // illegal
foreach ($a as list()) // illegal (also illegal in PHP5)
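
Note that only a completely empty list() is rejected. Skipping individual elements with empty slots remains valid, as in this short sketch (variable names are illustrative):

```php
<?php
$a = [1, 2, 3];

// Skip the first element and take only the second.
list(, $second) = $a;

// Skip the middle element.
list($first, , $third) = $a;

echo "$first $second $third", PHP_EOL; // 1 2 3
```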

The order of reference assignment

In PHP5, reference assignment was evaluated from right to left. It is now evaluated from left to right:

<?php
$obj = new stdClass;
$obj->a = &$obj->b;
$obj->b = 1;
var_dump($obj);

// PHP5:
object(stdClass)#1 (2) {
  ["b"] => &int(1)
  ["a"] => &int(1)
}

// PHP7:
object(stdClass)#1 (2) {
  ["a"] => &int(1)
  ["b"] => &int(1)
}

__clone method can be called directly

Now you can directly use $obj->__clone() to call the __clone method. __clone was the only magic method that was previously prohibited from being called directly. Previously you would get an error like this:

Fatal error: Cannot call __clone() method on objects - use 'clone $obj' instead in ...
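
The difference between a direct call and the clone operator can be seen in a small sketch for PHP7 and later. The Counter class is hypothetical. A direct call only runs the method body on the existing object, while clone creates a copy and then runs __clone on that copy.

```php
<?php
// Hypothetical class: counts how often __clone has run on an instance.
class Counter
{
    public $cloneCalls = 0;

    public function __clone()
    {
        $this->cloneCalls++;
    }
}

$original = new Counter;

// Direct call: no new object is created, the method just runs on $original.
$original->__clone();

// clone operator: copies $original (cloneCalls is 1), then runs __clone on the copy.
$copy = clone $original;

echo $original->cloneCalls, ' ', $copy->cloneCalls, PHP_EOL; // 1 2
```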

Variable syntax consistency

The AST also resolves some syntax consistency issues, which were raised in another RFC:

https://wiki.php.net/rfc/uniform_variable_syntax.

In the new implementation, some expressions are interpreted differently than before. For details, see the following table:


Expression             PHP5                     PHP7
$$foo['bar']['baz']    ${$foo['bar']['baz']}    ($$foo)['bar']['baz']
$foo->$bar['baz']      $foo->{$bar['baz']}      ($foo->$bar)['baz']
$foo->$bar['baz']()    $foo->{$bar['baz']}()    ($foo->$bar)['baz']()
Foo::$bar['baz']()     Foo::{$bar['baz']}()     (Foo::$bar)['baz']()

On the whole, such expressions used to be evaluated from right to left and are now evaluated from left to right, which is also consistent with the principle that parentheses do not affect behavior. Take care with these complex variable expressions in real code.

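When the intended reading matters, it can be spelled out with braces or parentheses so that the code does not depend on the default interpretation. A sketch for PHP7 and later; the object, property names and values are invented for illustration:

```php
<?php
$foo = new stdClass;
$foo->prop = ['baz' => 'left-to-right'];
$foo->name = 'right-to-left';

// PHP7 reads this as ($foo->$bar)['baz'].
$bar = 'prop';
$implicit = $foo->$bar['baz'];

// The same reading written explicitly with parentheses.
$explicit = ($foo->$bar)['baz'];

// The PHP5 reading must now be written with braces.
$keys = ['baz' => 'name'];
$php5Style = $foo->{$keys['baz']};

echo "$implicit / $explicit / $php5Style", PHP_EOL;
// left-to-right / left-to-right / right-to-left
```
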

This article is reproduced from segmentfault.com.