To double quote or not, that is the question!

王林
2024-08-16 16:34:49

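I recently heard again that PHP folks are still talking about single quotes versus double quotes, and that using single quotes is just a micro-optimisation, but if you get used to using single quotes all the time you will save a whole bunch of CPU cycles!

"Everything has already been said, but not yet by everyone" – Karl Valentin

It is in this spirit that I am writing an article on the same topic that Nikita Popov already covered 12 years ago (if you are reading his article, you can stop reading here).

What is the fuss all about?

PHP performs string interpolation: it searches a string for variable usages and replaces them with the value of the variable used: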

$juice = "apple";
echo "They drank some $juice juice.";
// will output: They drank some apple juice.

This feature is limited to strings in double quotes and heredoc. Using single quotes (or nowdoc) yields a different result:

$juice = "apple";
echo 'They drank some $juice juice.';
// will output: They drank some $juice juice.
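
The heredoc and nowdoc variants mentioned above follow the same rule. Here is a minimal sketch that puts all four syntaxes side by side; the variable names are only for illustration:

```php
<?php
// Sketch: the four string syntaxes side by side.
$juice = "apple";

$double = "They drank some $juice juice."; // interpolated
$single = 'They drank some $juice juice.'; // taken literally

// Heredoc behaves like a double quoted string.
$heredoc = <<<TXT
They drank some $juice juice.
TXT;

// Nowdoc (note the quoted identifier) behaves like a single quoted string.
$nowdoc = <<<'TXT'
They drank some $juice juice.
TXT;

var_dump($double === $heredoc); // bool(true)
var_dump($single === $nowdoc);  // bool(true)
```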

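Note that PHP does not search for variables in a single quoted string. So we could just start using single quotes everywhere, and people started suggesting changes like this ..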

- $juice = "apple";
+ $juice = 'apple';

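.. because it would be faster and save a bunch of CPU cycles on every execution of that code, since PHP does not look for variables in single quoted strings (which do not exist in this example anyway). Everyone is happy, case closed.

Case closed?

Obviously there is a difference between using single quotes and double quotes, but to understand what is going on we need to dig a bit deeper.

Even though PHP is an interpreted language, it uses a compile step in which several parts play together to produce something the virtual machine can actually execute: opcodes. So how do we get from PHP source code to opcodes?

The lexer

The lexer scans the source code file and breaks it down into tokens. A simple example of what this means can be found in the token_get_all() function documentation. A PHP source file containing just <?php echo ""; becomes these tokens: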

T_OPEN_TAG (<?php )
T_ECHO (echo)
T_WHITESPACE ( )
T_CONSTANT_ENCAPSED_STRING ("")

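We can see this in action and play with it in this 3v4l.org snippet.

The parser

The parser takes these tokens and generates an abstract syntax tree (AST) from them. Represented as JSON, the AST of the example above looks like this: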
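
To reproduce such a token listing locally, something along the following lines should work. It is a minimal sketch that relies on the tokenizer extension (token_get_all() and token_name()); describeTokens() is a hypothetical helper name and not part of PHP:

```php
<?php
// Sketch: list the tokens of a tiny script.
// describeTokens() is a hypothetical helper, not a PHP built-in.
function describeTokens(string $source): array
{
    $lines = [];
    foreach (token_get_all($source) as $token) {
        if (is_array($token)) {
            // Named tokens come as [token id, text, line number].
            $lines[] = token_name($token[0]) . ' (' . $token[1] . ')';
        } else {
            // Single-character tokens such as ";" are plain strings.
            $lines[] = $token;
        }
    }
    return $lines;
}

echo implode(PHP_EOL, describeTokens('<?php echo "";')), PHP_EOL;
```

Besides the four named tokens, the listing also contains the trailing ";", which token_get_all() returns as a plain string.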

{
  "data": [
    {
      "nodeType": "Stmt_Echo",
      "attributes": {
        "startLine": 1,
        "startTokenPos": 1,
        "startFilePos": 6,
        "endLine": 1,
        "endTokenPos": 4,
        "endFilePos": 13
      },
      "exprs": [
        {
          "nodeType": "Scalar_String",
          "attributes": {
            "startLine": 1,
            "startTokenPos": 3,
            "startFilePos": 11,
            "endLine": 1,
            "endTokenPos": 3,
            "endFilePos": 12,
            "kind": 2,
            "rawValue": "\"\""
          },
          "value": ""
        }
      ]
    }
  ]
}

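In case you want to play with this as well and see what the AST of other code looks like, I found https://phpast.com/ by Ryan Chandler and https://php-ast-viewer.com/, both of which show the AST of a given piece of PHP code.

The compiler

The compiler takes the AST and creates opcodes. The opcodes are what the virtual machine executes; they are also stored in the OPcache if you have it set up and enabled (which I highly recommend).

To view the opcodes, we have several options (maybe more, but I know of these three):

  1. use the Vulcan Logic Dumper extension, which is also baked into 3v4l.org
  2. use phpdbg -p script.php to dump the opcodes
  3. or use OPcache's opcache.opt_debug_level INI setting to make it print the opcodes
    • a value of 0x10000 outputs the opcodes before optimisation
    • a value of 0x20000 outputs the opcodes after optimisation
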
$ echo '<?php echo "";' > foo.php
$ php -dopcache.opt_debug_level=0x10000 foo.php
$_main:
...
0000 ECHO string("")
0001 RETURN int(1)

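Hypothesis

Coming back to the initial idea of saving CPU cycles by using single quotes instead of double quotes, I think we all agree that this would only be true if PHP evaluated these strings at runtime on every single request.

What happens at runtime?

So let's see which opcodes PHP creates for the two different versions.

Double quotes: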

<?php echo "apple";
0000 ECHO string("apple")
0001 RETURN int(1)

vs. single quotes:

<?php echo 'apple';
0000 ECHO string("apple")
0001 RETURN int(1)

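Hey wait, something weird happened. This looks identical! Where did my micro-optimisation go?

Well, maybe the ECHO opcode handler parses the given string, although there is no marker or anything else telling it to do so ... hmm?

Let's try a different approach and see what the lexer does in those two cases:

Double quotes: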

T_OPEN_TAG (<?php )
T_ECHO (echo)
T_WHITESPACE ( )
T_CONSTANT_ENCAPSED_STRING ("")

vs. single quotes:

T_OPEN_TAG (<?php )
T_ECHO (echo)
T_WHITESPACE ( )
T_CONSTANT_ENCAPSED_STRING ('')

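The tokens still distinguish between double and single quotes, but checking the AST gives an identical result for both cases. The only difference is the rawValue in the Scalar_String node attributes, which still carries the original single or double quotes, while the value is shown in double quotes in both cases.

New hypothesis

Could it be that string interpolation is actually done at compile time?

Let's check with a slightly more "sophisticated" example: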
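
The token-level difference can also be checked from PHP itself. This is a minimal sketch using token_get_all(); stringLiteralToken() is a hypothetical helper name:

```php
<?php
// Sketch: find the string literal token for each quoting style.
// stringLiteralToken() is a hypothetical helper, not a PHP built-in.
function stringLiteralToken(string $source): ?array
{
    foreach (token_get_all($source) as $token) {
        if (is_array($token) && $token[0] === T_CONSTANT_ENCAPSED_STRING) {
            return [token_name($token[0]), $token[1]];
        }
    }
    return null;
}

$double = stringLiteralToken('<?php echo "apple";');
$single = stringLiteralToken("<?php echo 'apple';");

var_dump($double[0] === $single[0]); // same token type for both
var_dump($double[1], $single[1]);    // the raw text keeps the original quotes
```

Both literals end up as the same token type; only the raw token text still remembers which quotes were used.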

<?php
$juice="apple";
echo "juice: $juice";

The tokens for this file are:

T_OPEN_TAG (<?php)
T_VARIABLE ($juice)
T_CONSTANT_ENCAPSED_STRING ("apple")
T_WHITESPACE ()
T_ECHO (echo)
T_WHITESPACE ( )
T_ENCAPSED_AND_WHITESPACE (juice: )
T_VARIABLE ($juice)

Look at the last two tokens! String interpolation is handled in the lexer and as such is a compile time thing and has nothing to do with runtime.
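
A quick way to verify this yourself is to compare the token names of an interpolated string with those of a single quoted one. The following is only a sketch; tokenNames() is a hypothetical helper name:

```php
<?php
// Sketch: collect the names of all named tokens in a piece of source code.
// tokenNames() is a hypothetical helper, not a PHP built-in.
function tokenNames(string $source): array
{
    $names = [];
    foreach (token_get_all($source) as $token) {
        if (is_array($token)) {
            $names[] = token_name($token[0]);
        }
    }
    return $names;
}

// The sources are passed as single quoted strings so that $juice
// reaches the lexer untouched.
$interpolated = tokenNames('<?php echo "juice: $juice";');
$literal      = tokenNames('<?php echo \'juice: $juice\';');

print_r($interpolated); // contains T_ENCAPSED_AND_WHITESPACE and T_VARIABLE
print_r($literal);      // contains a single T_CONSTANT_ENCAPSED_STRING instead
```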

For completeness, let's have a look at the opcodes generated by this (after optimisation, using 0x20000):

0000 ASSIGN CV0($juice) string("apple")
0001 T2 = FAST_CONCAT string("juice: ") CV0($juice)
0002 ECHO T2
0003 RETURN int(1)

These are different opcodes than we had in our simple echo "apple"; example, which is fine, because we are doing something different here.

Get to the point: should I concat or interpolate?

Let's have a look at these three different versions:

<?php
$juice = "apple";
echo "juice: $juice $juice";
echo "juice: ", $juice, " ", $juice;
echo "juice: ".$juice." ".$juice;
  • the first version is using string interpolation
  • the second is using a comma separation (which AFAIK only works with echo and not with assigning variables or anything else)
  • and the third option uses string concatenation

The first opcode assigns the string "apple" to the variable $juice:

0000 ASSIGN CV0($juice) string("apple")

The first version (string interpolation) uses a rope as the underlying data structure, which is optimised to make as few string copies as possible.

0001 T2 = ROPE_INIT 4 string("juice: ")
0002 T2 = ROPE_ADD 1 T2 CV0($juice)
0003 T2 = ROPE_ADD 2 T2 string(" ")
0004 T1 = ROPE_END 3 T2 CV0($juice)
0005 ECHO T1

The second version is the most memory efficient, as it does not create an intermediate string representation. Instead it makes multiple calls to ECHO, which is a blocking call from an I/O perspective, so depending on your use case this might be a downside.

0006 ECHO string("juice: ")
0007 ECHO CV0($juice)
0008 ECHO string(" ")
0009 ECHO CV0($juice)

The third version uses CONCAT/FAST_CONCAT to create an intermediate string representation and as such might use more memory than the rope version.

0010 T1 = CONCAT string("juice: ") CV0($juice)
0011 T2 = FAST_CONCAT T1 string(" ")
0012 T1 = CONCAT T2 CV0($juice)
0013 ECHO T1

So ... what is the right thing to do here and why is it string interpolation?

String interpolation uses either a FAST_CONCAT in the case of echo "juice: $juice"; or the highly optimised ROPE_* opcodes in the case of echo "juice: $juice $juice";. Most importantly, it communicates the intent clearly, and none of this has been a bottleneck in any of the PHP applications I have worked with so far, so none of this actually matters.
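
Whatever opcodes are generated, the three versions print exactly the same text, which is easy to confirm. The sketch below captures the output of each variant with output buffering; capture() is a hypothetical helper name:

```php
<?php
// Sketch: confirm that the three echo variants produce identical output.
// capture() is a hypothetical helper: it runs a callable and returns
// everything the callable echoed.
function capture(callable $fn): string
{
    ob_start();
    $fn();
    return ob_get_clean();
}

$juice = "apple";

$interpolated = capture(function () use ($juice) {
    echo "juice: $juice $juice";
});
$commas = capture(function () use ($juice) {
    echo "juice: ", $juice, " ", $juice;
});
$concatenated = capture(function () use ($juice) {
    echo "juice: " . $juice . " " . $juice;
});

var_dump($interpolated === $commas && $commas === $concatenated); // bool(true)
```

Since the result is identical, the choice between the three comes down to readability rather than behaviour.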

TLDR

String interpolation is a compile time thing. Granted, without OPcache the lexer will have to check for variables used in double quoted strings on every request, even if there aren't any, wasting CPU cycles, but honestly: the problem is not the double quoted strings, but not using OPcache!

However, there is one caveat: PHP up to 4 (and I believe even including 5.0 and maybe even 5.1, I don't know) did string interpolation at runtime, so using these versions ... hmm, I guess if anyone really still uses PHP 5, the same as above applies: The problem is not the double quoted strings, but the use of an outdated PHP version.

Final advice

Update to the latest PHP version, enable OPcache and live happily ever after!
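
If you are unsure whether OPcache is active for the current SAPI, a check along these lines may help. It is a minimal sketch based on opcache_get_status(); opcacheIsEnabled() is a hypothetical helper name, and on the command line the result typically depends on the opcache.enable_cli setting:

```php
<?php
// Sketch: report whether OPcache is active in the current process.
// opcacheIsEnabled() is a hypothetical helper, not a PHP built-in.
function opcacheIsEnabled(): bool
{
    if (!function_exists('opcache_get_status')) {
        return false; // the OPcache extension is not loaded
    }

    // Pass false to skip the per-script details; returns false when disabled.
    $status = opcache_get_status(false);

    return is_array($status) && !empty($status['opcache_enabled']);
}

echo opcacheIsEnabled() ? "OPcache is enabled\n" : "OPcache is NOT enabled\n";
```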
