Analyzing the Zend VM engine from PHP syntactic sugar

1.

Let's start with a piece of syntactic sugar available since PHP 5.3. We usually write:

<?php
    $a = 0;
    $b = $a ? $a : 1;

With the sugar, it can be written like this:

<?php
    $a = 0;
    $b = $a ?: 1;

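Either way the result is $b = 1. The second form is more concise, but overusing syntactic sugar is generally not recommended, especially forms that are easy to confuse. For example, PHP 7 added ??: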

<?php
    $b = $a ?? 1;

which is equivalent to:

<?php
    $b = isset($a) ? $a : 1;

Do you find ?: and ?? easy to mix up? If so, I suggest not using them at all; readable, maintainable code matters more.
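To make the difference concrete, here is a minimal sketch (it assumes PHP 7.0 or later for ??; the variable names are only for illustration). ?: tests truthiness, while ?? tests isset():

```php
<?php
$a = 0;

// ?: tests truthiness: 0 is falsy, so the fallback 1 is used.
$elvis = $a ?: 1;

// ?? tests isset(): $a is set and not null, so its value 0 is kept.
$coalesce = $a ?? 1;

// ?? on an undefined variable falls back quietly, just as isset() would.
$fallback = $undefined ?? 'default';

var_dump($elvis, $coalesce, $fallback);
// int(1)
// int(0)
// string(7) "default"
```

With $a = 0 the two operators give different results, which is exactly the case that is easy to get wrong.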

Syntactic sugar is not the focus of this article; our goal is to use it as a starting point for discussing how the Zend VM works.

2.

The PHP source branch analyzed here is remotes/origin/PHP-5.6.14. For how to view opcodes with vld, see my earlier article:
http://www.yinqisen.cn/blog-680.html

<?php
    $a = 0;
    $b = $a ?: 1;

The corresponding opcodes are:

number of ops:  5
compiled vars:  !0 = $a, !1 = $b
line     #* E I O op                           fetch          ext  return  operands
-------------------------------------------------------------------------------------
   2     0  E >   ASSIGN                                                   !0, 0
   3     1        JMP_SET_VAR                                      $1      !0
         2        QM_ASSIGN_VAR                                    $1      1
         3        ASSIGN                                                   !1, $1
   4     4      > RETURN                                                   1

branch: #  0; line:     2-    4; sop:     0; eop:     4; out1:  -2
path #1: 0,

vim Zend/zend_language_parser.y +834

834     |   expr '?' ':' { zend_do_jmp_set(&$1, &$2, &$3 TSRMLS_CC); }
835         expr     { zend_do_jmp_set_else(&$$, &$5, &$2, &$3 TSRMLS_CC); }

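If you like, you can try redefining the ?: sugar yourself. The grammar follows BNF rules and is parsed with bison; if you are interested, search for the related material and dig deeper.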

From the vld opcodes we can tell that zend_do_jmp_set_else was executed. The code is in Zend/zend_compile.c:

void zend_do_jmp_set_else(znode *result, const znode *false_value, const znode *jmp_token, const znode *colon_token TSRMLS_DC)
{
    zend_op *opline = get_next_op(CG(active_op_array) TSRMLS_CC);

    SET_NODE(opline->result, colon_token);
    if (colon_token->op_type == IS_TMP_VAR) {
        if (false_value->op_type == IS_VAR || false_value->op_type == IS_CV) {
            CG(active_op_array)->opcodes[jmp_token->u.op.opline_num].opcode = ZEND_JMP_SET_VAR;
            CG(active_op_array)->opcodes[jmp_token->u.op.opline_num].result_type = IS_VAR;
            opline->opcode = ZEND_QM_ASSIGN_VAR;
            opline->result_type = IS_VAR;
        } else {
            opline->opcode = ZEND_QM_ASSIGN;
        }
    } else {
        opline->opcode = ZEND_QM_ASSIGN_VAR;
    }
    opline->extended_value = 0;
    SET_NODE(opline->op1, false_value);
    SET_UNUSED(opline->op2);

    GET_NODE(result, opline->result);

    CG(active_op_array)->opcodes[jmp_token->u.op.opline_num].op2.opline_num = get_next_op_number(CG(active_op_array));

    DEC_BPC(CG(active_op_array));
}

3.

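The two key opcodes are ZEND_JMP_SET_VAR and ZEND_QM_ASSIGN_VAR. How do we keep reading the code from here? First, a few words about PHP opcodes.

PHP 5.6 has 167 opcodes, which means it can perform 167 different kinds of operations. The official documentation is at http://php.net/manual/en/internals2.opcodes.list.php

Internally PHP uses the _zend_op struct to represent an opcode; vim Zend/zend_compile.h +111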

111 struct _zend_op {
112     opcode_handler_t handler;
113     znode_op op1;
114     znode_op op2;
115     znode_op result;
116     ulong extended_value;
117     uint lineno;
118     zend_uchar opcode;
119     zend_uchar op1_type;
120     zend_uchar op2_type;
121     zend_uchar result_type;
122 };

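PHP 7.0 is slightly different. The main change, made with 64-bit systems in mind, is that uint was replaced by uint32_t so that the byte width is explicit.

Think of an opcode as a calculator that accepts only two operands (op1, op2), performs one operation (the handler, for example add, subtract, multiply or divide), and hands you back a result (result), with a little extra handling for cases such as arithmetic overflow (extended_value).

The Zend VM works in exactly the same way for every opcode: each one has a handler (a function pointer) that points to the address of the function that processes it. This is a C function containing the code that executes the opcode. It takes op1 and op2 as its operands and, when it finishes, produces a result (result), sometimes with an extra piece of information attached (extended_value).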
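The calculator analogy can be sketched in a few lines of PHP. This is a toy model only: make_op() and run_op() are hypothetical names invented for the illustration, not Zend APIs, and the array merely mimics the handler/op1/op2/result fields of _zend_op.

```php
<?php
// Toy model only: these names are hypothetical, not part of the Zend engine.

// One "opline": a handler plus two operands and a slot for the result.
function make_op(callable $handler, $op1, $op2): array
{
    return ['handler' => $handler, 'op1' => $op1, 'op2' => $op2, 'result' => null];
}

// Executing an opline means calling its handler with op1 and op2
// and storing whatever comes back in the result slot.
function run_op(array $op): array
{
    $op['result'] = ($op['handler'])($op['op1'], $op['op2']);
    return $op;
}

$add    = function ($a, $b) { return $a + $b; };
$concat = function ($a, $b) { return $a . $b; };

$sum    = run_op(make_op($add, 2, 3));
$joined = run_op(make_op($concat, 'foo', 'bar'));

var_dump($sum['result'], $joined['result']);
// int(5)
// string(6) "foobar"
```

The point of the sketch is only that every operation has the same shape, so the executor never needs to know what a particular handler does.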

Take the opcode ZEND_JMP_SET_VAR from our example; vim Zend/zend_vm_def.h +4995

4942 ZEND_VM_HANDLER(158, ZEND_JMP_SET_VAR, CONST|TMP|VAR|CV, ANY)
4943 {
4944     USE_OPLINE
4945     zend_free_op free_op1;
4946     zval *value, *ret;
4947
4948     SAVE_OPLINE();
4949     value = GET_OP1_ZVAL_PTR(BP_VAR_R);
4950
4951     if (i_zend_is_true(value)) {
4952         if (OP1_TYPE == IS_VAR || OP1_TYPE == IS_CV) {
4953             Z_ADDREF_P(value);
4954             EX_T(opline->result.var).var.ptr = value;
4955             EX_T(opline->result.var).var.ptr_ptr = &EX_T(opline->result.var).var.ptr;
4956         } else {
4957             ALLOC_ZVAL(ret);
4958             INIT_PZVAL_COPY(ret, value);
4959             EX_T(opline->result.var).var.ptr = ret;
4960             EX_T(opline->result.var).var.ptr_ptr = &EX_T(opline->result.var).var.ptr;
4961             if (!IS_OP1_TMP_FREE()) {
4962                 zval_copy_ctor(EX_T(opline->result.var).var.ptr);
4963             }
4964         }
4965         FREE_OP1_IF_VAR();
4966 #if DEBUG_ZEND>=2
4967         printf("Conditional jmp to %d\n", opline->op2.opline_num);
4968 #endif
4969         ZEND_VM_JMP(opline->op2.jmp_addr);
4970     }
4971
4972     FREE_OP1();
4973     CHECK_EXCEPTION();
4974     ZEND_VM_NEXT_OPCODE();
4975 }

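i_zend_is_true checks whether the operand is true, so ZEND_JMP_SET_VAR is a conditional assignment: if the value is truthy it is stored in the result and execution jumps past the fallback. I trust everyone can follow that, so let's move on to the key point.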
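One visible consequence of this handler is that the left operand of ?: is evaluated only once, because the value that was tested is the same value that gets stored in the result. A small sketch at the PHP level (the $next closure and the counter are hypothetical helpers for the demonstration):

```php
<?php
$calls = 0;

// Hypothetical helper that counts how many times it is evaluated.
$next = function () use (&$calls) {
    $calls++;
    return 42;
};

// Long form: the condition and the "true" branch each evaluate the call.
$long = $next() ? $next() : 1;
$callsLong = $calls;

$calls = 0;

// Short form: the left operand is evaluated once and reused when truthy.
$short = $next() ?: 1;
$callsShort = $calls;

var_dump($callsLong, $callsShort, $short);
// int(2)
// int(1)
// int(42)
```

So the two spellings are only equivalent when the left-hand expression has no side effects.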

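Note that zend_vm_def.h is not a C header that can be compiled directly; it is really a template. The header that actually gets compiled is zend_vm_execute.h (a file of more than 45,000 lines). It is not written by hand but generated by the PHP script zend_vm_gen.php, which parses zend_vm_def.h. (Interesting, isn't it? A chicken-and-egg question: without PHP, where would this script come from?) My guess is that this is a later addition and that early PHP versions did not work this way.

Starting from the ZEND_JMP_SET_VAR code above, the operand-type list CONST|TMP|VAR|CV ends up producing several handler functions that differ by operand type but do the same job: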

static int ZEND_FASTCALL  ZEND_JMP_SET_VAR_SPEC_CONST_HANDLER(ZEND_OPCODE_HANDLER_ARGS)
static int ZEND_FASTCALL  ZEND_JMP_SET_VAR_SPEC_TMP_HANDLER(ZEND_OPCODE_HANDLER_ARGS)
static int ZEND_FASTCALL  ZEND_JMP_SET_VAR_SPEC_VAR_HANDLER(ZEND_OPCODE_HANDLER_ARGS)
static int ZEND_FASTCALL  ZEND_JMP_SET_VAR_SPEC_CV_HANDLER(ZEND_OPCODE_HANDLER_ARGS)

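The purpose is to fix the handler at compile time and so improve run-time performance. Choosing by operand type at run time would also work, but it performs worse. This approach sometimes generates junk (seemingly useless) code, but there is no need to worry: the C compiler will optimize it further.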
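The naming pattern of the four functions listed above can be reproduced with a few lines of PHP. This is only a toy illustration of the idea of one specialized handler per operand type; specialized_handler_names() is a hypothetical helper, and it is not how zend_vm_gen.php is actually implemented.

```php
<?php
// Toy illustration only; not the real zend_vm_gen.php logic.

// Expand an operand-type list such as "CONST|TMP|VAR|CV" into one
// specialized handler name per operand type.
function specialized_handler_names(string $opcode, string $op1Spec): array
{
    $names = [];
    foreach (explode('|', $op1Spec) as $type) {
        $names[] = sprintf('%s_SPEC_%s_HANDLER', $opcode, $type);
    }
    return $names;
}

$names = specialized_handler_names('ZEND_JMP_SET_VAR', 'CONST|TMP|VAR|CV');
print_r($names);
```

Because the compiler already knows the operand type of each opline, it can store the matching specialized function in the handler field once, and no type check is needed when the opline runs.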

zend_vm_gen.php also accepts some parameters; the details are described in Zend/README.ZEND_VM in the PHP source tree.

4.

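At this point we know how an opcode maps to its handler. But there is still the bigger picture: after the source has been parsed, how are all the opcodes chained together?

I won't go into the details of parsing. Once it is done, there is one big array holding all the opcodes (the opcodes array of the op_array, indexed by opline number as in the compile code above). From the handler code above we can see that each handler, when it finishes, calls ZEND_VM_NEXT_OPCODE() to move to the next opcode and carry on until execution finally exits. The loop code, vim Zend/zend_vm_execute.h +337: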

ZEND_API void execute_ex(zend_execute_data *execute_data TSRMLS_DC)
{
    DCL_OPLINE
    zend_bool original_in_execution;

    original_in_execution = EG(in_execution);
    EG(in_execution) = 1;

    if (0) {
zend_vm_enter:
        execute_data = i_create_execute_data_from_op_array(EG(active_op_array), 1 TSRMLS_CC);
    }

    LOAD_REGS();
    LOAD_OPLINE();

    while (1) {
        int ret;
#ifdef ZEND_WIN32
        if (EG(timed_out)) {
            zend_timeout(0);
        }
#endif

        if ((ret = OPLINE->handler(execute_data TSRMLS_CC)) > 0) {
            switch (ret) {
                case 1:
                    EG(in_execution) = original_in_execution;
                    return;
                case 2:
                    goto zend_vm_enter;
                    break;
                case 3:
                    execute_data = EG(current_execute_data);
                    break;
                default:
                    break;
            }
        }

    }
    zend_error_noreturn(E_ERROR, "Arrived at end of main loop which shouldn't happen");
}

The macro definitions, vim Zend/zend_execute.c +1772

1772 #define ZEND_VM_NEXT_OPCODE() \
1773     CHECK_SYMBOL_TABLES() \
1774     ZEND_VM_INC_OPCODE(); \
1775     ZEND_VM_CONTINUE()

329 #define ZEND_VM_CONTINUE()         return 0
330 #define ZEND_VM_RETURN()           return 1
331 #define ZEND_VM_ENTER()            return 2
332 #define ZEND_VM_LEAVE()            return 3

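The while is an infinite loop that runs one handler function per iteration. With a few exceptions, most handler functions end by calling ZEND_VM_NEXT_OPCODE() -> ZEND_VM_CONTINUE(), which returns 0, and the loop continues.

Note: yield (coroutines) is one exception; it returns 1, which returns straight out of the loop. We may analyze yield separately another time.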
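The shape of this loop can be imitated in PHP. The following is a toy sketch under stated assumptions, not the engine's code: run(), the $state array and the closures are all hypothetical, and only the return codes 0 (continue) and 1 (return) are borrowed from the macros shown above.

```php
<?php
// Toy dispatch loop; all names here are hypothetical.
// Return codes follow the macros above: 0 = continue, 1 = return.

function run(array $oplines): array
{
    $state = ['ip' => 0, 'output' => []];

    while (true) {
        $handler = $oplines[$state['ip']];
        $ret = $handler($state);      // the handler updates $state by reference
        if ($ret > 0) {
            return $state['output'];  // like "case 1: return;"
        }
    }
}

// Builds an "ECHO"-like handler that records text and advances to the next opline.
$echo = function (string $text): callable {
    return function (array &$state) use ($text): int {
        $state['output'][] = $text;
        $state['ip']++;               // plays the role of ZEND_VM_INC_OPCODE()
        return 0;                     // plays the role of ZEND_VM_CONTINUE()
    };
};

// A "RETURN"-like handler that stops the loop.
$return = function (array &$state): int {
    return 1;                         // plays the role of ZEND_VM_RETURN()
};

$output = run([$echo('foo'), $echo('bar'), $return]);
print_r($output);
```

The loop itself knows nothing about individual operations; it only calls the current handler and inspects the return code, which is the same division of labour as in execute_ex.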

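I hope that after reading the above you have a detailed picture of how the PHP Zend engine processes code. Building on that analysis, let's briefly talk about PHP optimization.

5. PHP optimization notes

5.1 echo output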

<?php
    $foo = 'foo';
    $bar = 'bar';
    echo $foo . $bar;

The opcodes as shown by vld:

number of ops:  5
compiled vars:  !0 = $foo, !1 = $bar
line     #* E I O op                           fetch          ext  return  operands
-------------------------------------------------------------------------------------
   2     0  E >   ASSIGN                                                   !0, 'foo'
   3     1        ASSIGN                                                   !1, 'bar'
   4     2        CONCAT                                           ~2      !0, !1
         3        ECHO                                                     ~2
   5     4      > RETURN                                                   1

branch: #  0; line:     2-    5; sop:     0; eop:     4; out1:  -2
path #1: 0,

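ZEND_CONCAT joins the values of $foo and $bar, stores the result in the temporary variable ~2, and then echoes it. Along the way a block of memory has to be allocated for the temporary and freed afterwards, and the concatenation function has to be called to do the joining.

If we write it like this instead: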

<?php
    $foo = 'foo';
    $bar = 'bar';
    echo $foo, $bar;

The corresponding opcodes:

number of ops:  5
compiled vars:  !0 = $foo, !1 = $bar
line     #* E I O op                           fetch          ext  return  operands
-------------------------------------------------------------------------------------
   2     0  E >   ASSIGN                                                   !0, 'foo'
   3     1        ASSIGN                                                   !1, 'bar'
   4     2        ECHO                                                     !0
         3        ECHO                                                     !1
   5     4      > RETURN                                                   1

branch: #  0; line:     2-    5; sop:     0; eop:     4; out1:  -2
path #1: 0,

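No temporary has to be allocated and no concatenation function has to run, so this should be more efficient. If you want to understand the concatenation process, use what this article has covered to look up the handler for the ZEND_CONCAT opcode yourself; it does quite a lot of work.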
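Before swapping one form for the other, it is worth confirming that the output really is identical. A minimal sketch using output buffering (capture() is a hypothetical helper written for this check):

```php
<?php
// Hypothetical helper: run a callable and return whatever it echoed.
function capture(callable $fn): string
{
    ob_start();
    $fn();
    return ob_get_clean();
}

$foo = 'foo';
$bar = 'bar';

// One CONCAT followed by one ECHO.
$viaConcat = capture(function () use ($foo, $bar) {
    echo $foo . $bar;
});

// Two ECHOs, no temporary string.
$viaComma = capture(function () use ($foo, $bar) {
    echo $foo, $bar;
});

var_dump($viaConcat === $viaComma);
// bool(true)
```

Whether the saving is measurable in a real application depends on the workload, so treat it as a habit rather than a guaranteed speed-up.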

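5.2 define() and const

The const keyword (for use outside class definitions) was introduced in 5.3. It differs considerably from define and is actually closer in meaning to #define in C.

define() is a function call and carries function-call overhead.

const is a keyword that compiles directly to an opcode. Its value can be determined at compile time and does not need to be allocated dynamically at run time.

A const value is fixed and cannot be changed at run time, which is why it resembles C's #define: it is determined at compile time, and there are restrictions on the kinds of value it can hold.

Let's go straight to the code and compare the opcodes:

define example: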

<?php
    define('FOO', 'foo');
    echo FOO;

define opcodes:

number of ops:  6
compiled vars:  none
line     #* E I O op                           fetch          ext  return  operands
-------------------------------------------------------------------------------------
   2     0  E >   SEND_VAL                                                 'FOO'
         1        SEND_VAL                                                 'foo'
         2        DO_FCALL                                      2          'define'
   3     3        FETCH_CONSTANT                                   ~1      'FOO'
         4        ECHO                                                     ~1
   4     5      > RETURN                                                   1

const example:

<?php
    const FOO = 'foo';
    echo FOO;

const opcodes:

number of ops:  4
compiled vars:  none
line     #* E I O op                           fetch          ext  return  operands
-------------------------------------------------------------------------------------
   2     0  E >   DECLARE_CONST                                            'FOO', 'foo'
   3     1        FETCH_CONSTANT                                   ~0      'FOO'
         2        ECHO                                                     ~0
   4     3      > RETURN                                                   1
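The practical side of this difference can be shown with a short sketch. It assumes PHP 5.6 or later (for constant expressions in const) and must run at the top level of a file, since const is not allowed inside functions or conditionals; the constant names are made up for the example.

```php
<?php
// define() is an ordinary function call, so the name and the value
// can both be computed at run time.
$suffix = 'DEV';
define('APP_MODE_' . $suffix, strtoupper('debug'));

// const needs a literal name, and its value must be a constant
// expression that the compiler can work out (PHP 5.6+).
const APP_NAME = 'demo';
const APP_TITLE = APP_NAME . ' app';

var_dump(constant('APP_MODE_DEV'), APP_TITLE);
// string(5) "DEBUG"
// string(8) "demo app"
```

In other words, const trades flexibility for less run-time work, while define() stays available for the cases where the name or value is only known while the script runs.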

5.3 The cost of dynamic function calls

<?php
    function foo() { }
    foo();

The corresponding opcodes:

number of ops:  3
compiled vars:  none
line     #* E I O op                           fetch          ext  return  operands
-------------------------------------------------------------------------------------
   2     0  E >   NOP
   3     1        DO_FCALL                                      0          'foo'
   4     2      > RETURN                                                   1

Code with a dynamic call:

<?php
    function foo() { }
    $a = 'foo';
    $a();

The opcodes:

number of ops:  5
compiled vars:  !0 = $a
line     #* E I O op                           fetch          ext  return  operands
-------------------------------------------------------------------------------------
   2     0  E >   NOP
   3     1        ASSIGN                                                   !0, 'foo'
   4     2        INIT_FCALL_BY_NAME                                       !0
         3        DO_FCALL_BY_NAME                              0
   5     4      > RETURN                                                   1

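You can run vim Zend/zend_vm_def.h +2630 to see what INIT_FCALL_BY_NAME does; the code is too long to list here. Dynamic features are convenient, but they always cost some performance, so weigh the trade-off before using them.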
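If you want to see the cost on your own machine instead of taking it on trust, a rough timing sketch like the one below can help. time_calls() is a hypothetical helper and the iteration count is arbitrary; the numbers vary by PHP version and configuration, so no expected output is given.

```php
<?php
function foo() { }

// Hypothetical helper: time a loop body over a number of iterations.
function time_calls(callable $body, int $iterations): float
{
    $start = microtime(true);
    $body($iterations);
    return microtime(true) - $start;
}

// Direct call: the function name is known at compile time.
$direct = time_calls(function (int $n) {
    for ($i = 0; $i < $n; $i++) {
        foo();
    }
}, 200000);

// Dynamic call: the name is looked up from a variable at run time.
$dynamic = time_calls(function (int $n) {
    $name = 'foo';
    for ($i = 0; $i < $n; $i++) {
        $name();
    }
}, 200000);

printf("direct: %.4fs, dynamic: %.4fs\n", $direct, $dynamic);
```

Run it several times and compare the trend, not a single reading; newer PHP versions may narrow the gap considerably.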

5.4 The cost of delayed class declaration

Again, let's look at the code first:

<?php
    class Bar { }
    class Foo extends Bar { }

The corresponding opcodes:

number of ops:  4
compiled vars:  none
line     #* E I O op                           fetch          ext  return  operands
-------------------------------------------------------------------------------------
   2     0  E >   NOP
   3     1        NOP
         2        NOP
   4     3      > RETURN

Now swap the declaration order:

<?php
    class Foo extends Bar { }
    class Bar { }

The corresponding opcodes:

number of ops:  4
compiled vars:  none
line     #* E I O op                           fetch          ext  return  operands
-------------------------------------------------------------------------------------
   2     0  E >   FETCH_CLASS                                   0  :0      'Bar'
         1        DECLARE_INHERITED_CLASS                                  '%00foo%2FUsers%2Fqisen%2Ftmp%2Fvld.php0x103d58020', 'foo'
   3     2        NOP
   4     3      > RETURN                                                   1

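In a strongly typed, compiled language the second form would be a compile error, but a dynamic language like PHP pushes the class declaration back to run time. If you are not careful, it is easy to step on this landmine.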
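Run-time declaration can be observed directly with class_exists(). The sketch below uses a related but simpler case than the inheritance example, a class declared inside a conditional block, because its behaviour is easy to check; the class names Hoisted and Late are made up for the illustration, and the script is assumed to run at the top level of a file.

```php
<?php
// A plain top-level class is bound at compile time, so it already
// exists before execution reaches the line that declares it.
$hoistedEarly = class_exists('Hoisted', false);
class Hoisted { }

// A class declared inside a conditional block is declared at run time,
// only when execution reaches that block.
$lateBefore = class_exists('Late', false);
if (true) {
    class Late { }
}
$lateAfter = class_exists('Late', false);

var_dump($hoistedEarly, $lateBefore, $lateAfter);
// bool(true)
// bool(false)
// bool(true)
```

Any code that uses a run-time declared class before its declaration has executed will fail, which is the kind of ordering bug the paragraph above warns about.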

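So once we understand how the Zend VM works, we should be all the more careful to use dynamic features sparingly; whenever they are optional, don't use them.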

