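Whether you have worked with local files, HTTP requests, or compressed files, you have dealt with streams, but... do you really know them?
I think this is one of the most misunderstood concepts in PHP, and as a result I have seen quite a few bugs introduced due to the lack of some fundamental knowledge.
In this article, I'll try to explain what streams really are and how to work with them. We'll see a lot of functions used to work with streams, along with plenty of examples, but it is not my intention to "redocument" all of them in any way.
Before learning what streams are, we first need to approach resources.
A resource is simply a reference or pointer to an external resource, such as a file, a database, or a network or SSH connection.
There are several types of resources, such as curl, created by curl_init() (before PHP 8.0, which returns a CurlHandle object instead); process, created by proc_open(); and stream, created by functions such as fopen() and opendir().
Streams are PHP's way of generalizing resource types that share a common behaviour, that is, resources that can be read and written linearly, like a cassette tape (damn, I'm getting old). Some examples of streams are file resources, HTTP response bodies, and compressed files, to name a few.
Streams are really useful because they let us work with resources ranging from a few bytes to several GB in size, where trying to read them in full would, for example, exhaust our available memory.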
fopen( string $filename, string $mode, bool $use_include_path = false, ?resource $context = null ): resource|false
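fopen opens a file or a network resource[1], depending on the path provided to the first parameter. As mentioned before, this resource is of the stream type:
$fileStream = fopen('/tmp/test', 'w');
echo get_resource_type($fileStream); // 'stream'
If $filename is given in the form scheme://, it is assumed to be a URL, and PHP will try to find a supported protocol handler/wrapper that matches the path, such as file:// to handle local files, http:// to work with remote HTTP/S resources, ssh2:// to handle SSH connections, or php://, which gives us access to PHP's own input and output streams, such as php://stdin, php://stdout and php://stderr.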
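To see which wrappers a given installation supports, and to try one of the php:// streams, a minimal sketch could look like this. It assumes nothing beyond core PHP; the variable names are only illustrative, and the list printed by stream_get_wrappers() varies from one build to another.

```php
<?php
// List the wrappers registered in this PHP build (the exact list varies by installation).
$wrappers = stream_get_wrappers();
echo implode(', ', $wrappers) . PHP_EOL;

// php://memory is an in-memory stream: handy for experiments without touching the disk.
$memoryStream = fopen('php://memory', 'w+');
fwrite($memoryStream, 'hello streams');
rewind($memoryStream); // move the pointer back to the start before reading
$contents = stream_get_contents($memoryStream);
fclose($memoryStream);

echo $contents . PHP_EOL; // hello streams
```

The php://memory stream is used here only because it keeps the experiment self-contained; the same calls work on a stream opened from a regular file.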
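$mode defines the type of access you need to the stream, that is, whether you need read-only access, write-only access, read and write, reading/writing from the beginning or the end of the stream, and so on.
The mode also depends on the type of resource you are working with. For instance:
$fileStream = fopen('/tmp/test', 'w');
$networkStream = fopen('https://google.com', 'r');
Opening a writable stream with the https:// wrapper, for example, does not work:
fopen('https://google.com', 'w'); // Failed to open stream: HTTP wrapper does not support writeable connections
[1] fopen can only be used with network or remote resources when allow_url_fopen is enabled in php.ini. For more information, check the documentation.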
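To make the effect of $mode more concrete, here is a small sketch that opens the same file with three common modes. The scratch file created with tempnam() is an assumption made for the example; it is not the /tmp/test file used elsewhere in the article.

```php
<?php
// Hypothetical scratch file; tempnam() creates a unique empty file in the temp directory.
$path = tempnam(sys_get_temp_dir(), 'stream_demo_');

// 'w' truncates the file (or creates it) and opens it for writing only.
$stream = fopen($path, 'w');
fwrite($stream, 'first');
fclose($stream);

// 'a' opens the file for writing only and always appends to the end.
$stream = fopen($path, 'a');
fwrite($stream, ' second');
fclose($stream);

// 'r' opens the file for reading only, with the pointer at the beginning.
$stream = fopen($path, 'r');
$contents = fread($stream, 100);
fclose($stream);

echo $contents . PHP_EOL; // first second
unlink($path);
```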
Now that we have a stream resource, what can we do with it?
fwrite(resource $stream, string $data, ?int $length = null): int|false
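fwrite lets us write the contents provided in $data to a stream. If $length is provided, it writes at most that number of bytes. Let's see an example:
$fileStream = fopen('/tmp/test', 'w');
fwrite($fileStream, "The quick brown fox jumps over the lazy dog", 10);
In this example, since we provided $length = 10, only part of the content was written ("The quick ", including the trailing space) and the rest was ignored.
Notice that we opened the file stream with $mode = 'w', which allows us to write to the file. If we had opened the file with $mode = 'r' instead, we would get a message such as fwrite(): Write of 8192 bytes failed with errno=9 Bad file descriptor.
Let's see another example, now writing the whole content to the file stream:
$fileStream = fopen('/tmp/test', 'w');
fwrite($fileStream, "The quick brown fox jumps over the lazy dog");
Now, since we have not provided $length, the whole content was written to the file.
Writing to a stream moves the read/write pointer to the end of the written sequence. In this case, the string written to the stream is 43 bytes long, so the pointer should now be at position 43.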
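The pointer movement can be checked directly with ftell. The sketch below uses a php://memory stream instead of /tmp/test so that it can run anywhere; that substitution and the variable names are assumptions made for the example.

```php
<?php
$stream = fopen('php://memory', 'w+'); // in-memory stream, used here instead of /tmp/test
$sentence = 'The quick brown fox jumps over the lazy dog';

$partial = fwrite($stream, $sentence, 10); // writes only "The quick "
$positionAfterPartial = ftell($stream);    // the pointer advanced by 10 bytes

$full = fwrite($stream, $sentence);        // writes all 43 bytes after the first 10
$positionAfterFull = ftell($stream);       // 10 + 43

fclose($stream);
echo "$partial $positionAfterPartial $full $positionAfterFull" . PHP_EOL; // 10 10 43 53
```

The return value of fwrite is the number of bytes written, which is why it matches the distance the pointer travelled in each call.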
Besides writing to files, fwrite can write to other types of streams, such as sockets. An example taken from the documentation:
$sock = fsockopen("ssl://secure.example.com", 443, $errno, $errstr, 30);
if (!$sock) die("$errstr ($errno)\n");
$data = "foo=" . urlencode("Value for Foo") . "&bar=" . urlencode("Value for Bar");
fwrite($sock, "POST /form_action.php HTTP/1.0\r\n");
fwrite($sock, "Host: secure.example.com\r\n");
fwrite($sock, "Content-type: application/x-www-form-urlencoded\r\n");
fwrite($sock, "Content-length: " . strlen($data) . "\r\n");
fwrite($sock, "Accept: */*\r\n");
fwrite($sock, "\r\n");
fwrite($sock, $data);
$headers = "";
while ($str = trim(fgets($sock, 4096)))
    $headers .= "$str\n";
echo "\n";
$body = "";
while (!feof($sock))
    $body .= fgets($sock, 4096);
fclose($sock);
fread(resource $stream, int $length): string|false
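With fread you can read up to $length bytes from a stream, starting at the current read pointer. It is binary-safe and works with both local and network resources, as we will see in the examples.
Consecutive calls to fread read a chunk and then move the read pointer to the end of that chunk. For example, consider the file written in the previous example:
# Content: "The quick brown fox jumps over the lazy dog"
$fileStream = fopen('/tmp/test', 'r');
echo fread($fileStream, 10) . PHP_EOL; // 'The quick '
echo ftell($fileStream); // 10
echo fread($fileStream, 10) . PHP_EOL; // 'brown fox '
echo ftell($fileStream); // 20
We'll get back to ftell soon, but all it does is return the current position of the read pointer.
Reading stops, and fread returns what it has read so far, as soon as one of the following conditions is met (copied from the docs, you'll see why later):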
- length bytes have been read
- EOF (end of file) is reached
- a packet becomes available or the socket timeout occurs (for network streams)
- if the stream is read buffered and it does not represent a plain file, at most one read of up to a number of bytes equal to the chunk size (usually 8192) is made; depending on the previously buffered data, the size of the returned data may be larger than the chunk size.
I don't know if you had the same feeling, but this last part is pretty cryptic, so let's break it down.
Stream reads and writes can be buffered, that is, the content may be stored internally. It is possible to enable or disable buffering, as well as set the buffer sizes, using stream_set_read_buffer and stream_set_write_buffer, but according to this comment on the PHP documentation's GitHub repository, the description of these functions can be misleading.
This is where things get interesting, as this part of the documentation is really obscure. As per the comment, calling stream_set_read_buffer($stream, 0) disables read buffering, whereas stream_set_read_buffer($stream, 1) or stream_set_read_buffer($stream, 42) simply enables it, ignoring the size (depending on the stream wrapper, which can override this default behaviour).
The chunk size is usually 8192 bytes or 8 KiB, as we will confirm in a bit. We can change this value using stream_set_chunk_size. Let's see it in action:
$f = fopen('https://dl-cdn.alpinelinux.org/alpine/v3.20/releases/x86_64/alpine-standard-3.20.2-x86_64.iso', 'rb');
$previousPos = 0;
$chunkSize = 1024;
$i = 1;
while ($chunk = fread($f, $chunkSize)) {
    $bytesRead = (ftell($f) - $previousPos);
    $previousPos = ftell($f);
    echo "Iteration: {$i}. Bytes read: {$bytesRead}" . PHP_EOL;
    $i++;
}
Output:
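Iteration: 1. Bytes read: 1024
Iteration: 2. Bytes read: 1024
Iteration: 3. Bytes read: 1024
...
Iteration: 214016. Bytes read: 1024
Iteration: 214017. Bytes read: 169
What happened in this case is clear: every call to fread returned the 1024 bytes we asked for, and the last call returned the remaining 169 bytes.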
Now let's considerably increase the length provided to fread, to 1 MiB, and see what happens:
$f = fopen('https://dl-cdn.alpinelinux.org/alpine/v3.20/releases/x86_64/alpine-standard-3.20.2-x86_64.iso', 'rb');
$previousPos = 0;
$chunkSize = 1048576; // 1 MiB
$i = 1;
while ($chunk = fread($f, $chunkSize)) {
    $bytesRead = (ftell($f) - $previousPos);
    $previousPos = ftell($f);
    echo "Iteration: {$i}. Bytes read: {$bytesRead}" . PHP_EOL;
    $i++;
}
Output:
Iteration: 1. Bytes read: 1378
Iteration: 2. Bytes read: 1378
Iteration: 3. Bytes read: 1378
...
Iteration: 24. Bytes read: 1074
Iteration: 25. Bytes read: 8192
Iteration: 26. Bytes read: 8192
...
Iteration: 26777. Bytes read: 8192
Iteration: 26778. Bytes read: 8192
Iteration: 26779. Bytes read: 293
So, even though we tried to read 1 MiB using fread, it read at most 8192 bytes per call, the same value the docs said it would. Interesting. Let's try another experiment:
$f = fopen('https://dl-cdn.alpinelinux.org/alpine/v3.20/releases/x86_64/alpine-standard-3.20.2-x86_64.iso', 'rb');
$previousPos = 0;
$chunkSize = 1048576; // 1 MiB
$i = 1;
stream_set_chunk_size($f, $chunkSize); // Just added this line
while ($chunk = fread($f, $chunkSize)) {
    $bytesRead = (ftell($f) - $previousPos);
    $previousPos = ftell($f);
    echo "Iteration: {$i}. Bytes read: {$bytesRead}" . PHP_EOL;
    $i++;
}
And the output:
Iteration: 1. Bytes read: 1378
Iteration: 2. Bytes read: 1378
Iteration: 3. Bytes read: 1378
...
Iteration: 12. Bytes read: 533
Iteration: 13. Bytes read: 16384
Iteration: 14. Bytes read: 16384
...
Iteration: 13386. Bytes read: 16384
Iteration: 13387. Bytes read: 16384
Iteration: 13388. Bytes read: 13626
Notice that fread now reads up to 16 KiB per call. That is not even close to what we asked for, but it shows that stream_set_chunk_size did have an effect; there seem to be some hard limits, which I suppose also depend on the wrapper. Let's put that into practice with another experiment, this time using a local file:
$f = fopen('alpine-standard-3.20.2-x86_64.iso', 'rb');
$previousPos = 0;
$chunkSize = 1048576; // 1 MiB
$i = 1;
while ($chunk = fread($f, $chunkSize)) {
    $bytesRead = (ftell($f) - $previousPos);
    $previousPos = ftell($f);
    echo "Iteration: {$i}. Bytes read: {$bytesRead}" . PHP_EOL;
    $i++;
}
Output:
Iteration: 1. Bytes read: 1048576
Iteration: 2. Bytes read: 1048576
...
Iteration: 208. Bytes read: 1048576
Iteration: 209. Bytes read: 1048576
Aha! So with a local file we were able to fread 1 MiB at a time, as we wanted, and we did not even need to increase the chunk size with stream_set_chunk_size.
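If you don't have the ISO at hand, the local-file behaviour can be reproduced with a much smaller file. The following is only a sketch: the 2 MiB scratch file is an assumption standing in for the ISO, and the loop checks the return value explicitly instead of relying on truthiness.

```php
<?php
// Hypothetical 2 MiB scratch file, standing in for the ISO used in the article.
$path = tempnam(sys_get_temp_dir(), 'fread_demo_');
file_put_contents($path, str_repeat('x', 2 * 1048576));

$stream = fopen($path, 'rb');
$sizes = [];
while (!feof($stream)) {
    $chunk = fread($stream, 1048576); // ask for 1 MiB per call
    if ($chunk === false || $chunk === '') {
        break; // read error, or nothing left to read
    }
    $sizes[] = strlen($chunk);
}
fclose($stream);
unlink($path);

print_r($sizes); // two entries of 1048576 bytes each
```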
I think that now the description is less cryptic, at least. Let's read it again (with some interventions):
if the stream is read buffered ...
and it does not represent a plain file (that is, it is something other than a local file, such as a network resource), ...
at most one read of up to a number of bytes equal to the chunk size (usually 8192) is made (our experiments confirmed this: each call returned at most one chunk); ...
depending on the previously buffered data, the size of the returned data may be larger than the chunk size (we did not observe that, but I assume it may happen depending on the wrapper).
There is definitely some room to play here, but I will challenge you. What would happen if you disable the buffers while reading a file? And a network resource? What if you write into a file?
ftell(resource $stream): int|false
ftell returns the position of the read/write pointer (or false on failure).
# Content: "The quick brown fox jumps over the lazy dog"
$fileStream = fopen('/tmp/test', 'r');
fread($fileStream, 10); # "The quick "
echo ftell($fileStream); // 10
stream_get_meta_data(resource $stream): array
stream_get_meta_data returns information about the stream in form of an array. Let's see an example:
# Content: "The quick brown fox jumps over the lazy dog"
$fileStream = fopen('/tmp/test', 'r');
var_dump(stream_get_meta_data($fileStream));
The previous example would output something like this:
array(9) {
  ["timed_out"]=>
  bool(false)
  ["blocked"]=>
  bool(true)
  ["eof"]=>
  bool(false)
  ["wrapper_type"]=>
  string(9) "plainfile"
  ["stream_type"]=>
  string(5) "STDIO"
  ["mode"]=>
  string(1) "r"
  ["unread_bytes"]=>
  int(0)
  ["seekable"]=>
  bool(true)
  ["uri"]=>
  string(16) "file:///tmp/test"
}
This function's documentation is pretty honest describing each value ;)
fseek(resource $stream, int $offset, int $whence = SEEK_SET): int
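fseek sets the read/write pointer of the opened stream based on the value provided to $offset.
The position is calculated according to $whence:
- SEEK_SET: the position is set to exactly $offset bytes from the beginning of the stream
- SEEK_CUR: the position is set to the current position plus $offset
- SEEK_END: the position is set to the end of the file plus $offset
With SEEK_END we can provide a negative $offset and go backwards from EOF. The return value tells us whether the position was set successfully (0) or not (-1).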
Let's see some examples:
# Content: "The quick brown fox jumps over the lazy dog\n"
$fileStream = fopen('/tmp/test', 'r+');
fseek($fileStream, 4, SEEK_SET);
echo fread($fileStream, 5); // 'quick'
echo ftell($fileStream); // 9
fseek($fileStream, 7, SEEK_CUR);
echo ftell($fileStream); // 16, that is, 9 + 7
echo fread($fileStream, 3); // 'fox'
fseek($fileStream, 5, SEEK_END); // Sets the position past the End Of File
echo ftell($fileStream); // 49, that is, EOF (at 44th position) + 5
echo fread($fileStream, 3); // ''
echo ftell($fileStream); // 49, nothing to read, so read/write pointer hasn't changed
fwrite($fileStream, 'foo');
ftell($fileStream); // 52, that is, previous position + 3
fseek($fileStream, -3, SEEK_END);
ftell($fileStream); // 49, that is, 52 - 3
echo fread($fileStream, 3); // 'foo'
As we've seen in this example, it is possible to seek past the end of the file and even read from an unwritten area (which returns 0 bytes), but some types of streams do not support it.
An important consideration is that not all streams are seekable. For instance, you cannot fseek a remote resource:
$f = fopen('https://dl-cdn.alpinelinux.org/alpine/v3.20/releases/x86_64/alpine-standard-3.20.2-x86_64.iso', 'rb');
fseek($f, 10);
WARNING fseek(): Stream does not support seeking
This obviously makes total sense, as we cannot "fast-forward" and set a position on a remote resource. The stream in this case is only read sequentially, like a cassette tape.
We can determine if the stream is seekable or not via the seekable value returned by stream_get_meta_data that we've seen before.
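As a rough sketch of that check, the helper below wraps stream_get_meta_data and guards a call to fseek with it. The name isSeekable() is hypothetical, and a php://memory stream is assumed so that the example runs without a network connection.

```php
<?php
/**
 * Returns true when the given stream reports itself as seekable.
 * isSeekable() is a hypothetical helper name, not a built-in function.
 */
function isSeekable($stream): bool
{
    return stream_get_meta_data($stream)['seekable'];
}

$memoryStream = fopen('php://memory', 'w+');
$seekable = isSeekable($memoryStream);
var_dump($seekable); // bool(true)

// Only call fseek when the stream supports it; fseek returns 0 on success and -1 on failure.
if ($seekable) {
    fwrite($memoryStream, 'The quick brown fox');
    $result = fseek($memoryStream, 4, SEEK_SET);
    $word = fread($memoryStream, 5);
    echo $word . PHP_EOL; // quick
}
fclose($memoryStream);
```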
rewind(resource $stream): bool
This is a pure analogy to rewinding a videotape before returning it to the video store. As expected, rewind sets the position of the read/write pointer to 0, which is basically the same as calling fseek with an $offset of 0.
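A short sketch of rewind in action, again assuming a php://memory stream so that it can run anywhere:

```php
<?php
$stream = fopen('php://memory', 'w+');
fwrite($stream, 'The quick brown fox');
$before = ftell($stream);       // 19, the pointer sits right after what we wrote

$rewound = rewind($stream);     // true on success
$after = ftell($stream);        // 0, back at the beginning

$firstWord = fread($stream, 3); // "The"
fclose($stream);

echo "$before $after $firstWord" . PHP_EOL; // 19 0 The
```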
The same considerations we've seen for fseek apply to rewind, that is: the stream must be seekable, so it cannot be used on a remote resource.
So far we've been working directly with resources. file_get_contents is a bit different, as it accepts the file path and returns the whole file content as a string, that is, it implicitly opens the resource.
file_get_contents( string $filename, bool $use_include_path = false, ?resource $context = null, int $offset = 0, ?int $length = null ): string|false
Similar to fread, file_get_contents can work on local and remote resources, depending on the $filename we provide:
# Content: "The quick brown fox jumps over the lazy dog\n"
echo file_get_contents('/tmp/test'); // "The quick brown fox jumps over the lazy dog\n"
echo file_get_contents('https://www.php.net/images/logos/php-logo.svg'); // "<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 -1 100 50">\n ..."
With $offset we can set the starting point for reading the content, whereas with $length we can limit the number of bytes read.
# Content: "The quick brown fox jumps over the lazy dog\n"
echo file_get_contents('/tmp/test', offset: 16, length: 3); // 'fox'
$offset also accepts negative values, which count from the end of the stream.
# Content: "The quick brown fox jumps over the lazy dog\n"
echo file_get_contents('/tmp/test', offset: -4, length: 3); // 'dog'
Notice that the same rules that govern fseek also apply to $offset: you cannot set an $offset when reading remote files, because the function would basically fseek the stream, and we've seen that this does not work for remote resources.
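Here is a self-contained version of the $offset and $length examples. The scratch file created with tempnam() is an assumption that replaces /tmp/test, and the named arguments require PHP 8.0 or later.

```php
<?php
// Hypothetical scratch file with the same sentence, including the trailing newline (44 bytes).
$path = tempnam(sys_get_temp_dir(), 'fgc_demo_');
file_put_contents($path, "The quick brown fox jumps over the lazy dog\n");

// Start at byte 16 and read 3 bytes.
$fox = file_get_contents($path, offset: 16, length: 3);

// A negative offset counts from the end: the last 4 bytes are "dog\n", and we keep 3 of them.
$dog = file_get_contents($path, offset: -4, length: 3);

unlink($path);
echo "$fox $dog" . PHP_EOL; // fox dog
```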
The $context parameter makes file_get_contents really flexible, enabling us to set, for example, the HTTP method, the headers, or the timeout of a request.
We create a context using stream_context_create, for example:
$context = stream_context_create(['http' => ['method' => "POST"]]);
file_get_contents('https://a-valid-resource.xyz', context: $context);
You can find the list of options you can provide to stream_context_create in the PHP documentation on context options and parameters.
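A slightly fuller context might look like the sketch below. The form fields and the timeout value are made up for illustration; no request is sent, since the example only builds the context and inspects it with stream_context_get_options.

```php
<?php
// Body of a hypothetical form submission; http_build_query encodes spaces as "+".
$payload = http_build_query(['foo' => 'Value for Foo', 'bar' => 'Value for Bar']);

// Nothing is sent until the context is passed to a function such as
// file_get_contents() or fopen().
$context = stream_context_create([
    'http' => [
        'method'  => 'POST',
        'header'  => "Content-Type: application/x-www-form-urlencoded\r\n",
        'content' => $payload,
        'timeout' => 5, // seconds
    ],
]);

// Inspect what the context holds, without performing any request.
$options = stream_context_get_options($context);
echo $options['http']['method'] . PHP_EOL;  // POST
echo $options['http']['content'] . PHP_EOL; // foo=Value+for+Foo&bar=Value+for+Bar
```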
$networkResource = fopen('https://releases.ubuntu.com/24.04/ubuntu-24.04-desktop-amd64.iso', 'r');
while ($chunk = fread($networkResource, 1024)) {
    doSomething($chunk);
}
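One detail worth keeping in mind about loops of the form while ($chunk = fread(...)) is that PHP treats the string "0" as falsy, so a chunk consisting only of "0" ends the loop early. A more defensive variant, sketched below with a hypothetical readAll() helper and a php://memory stream, checks for EOF and false explicitly.

```php
<?php
/**
 * Reads a stream to the end in chunks, stopping only on EOF or on a read error.
 * readAll() is a hypothetical helper name, not a built-in function.
 */
function readAll($stream, int $chunkSize): string
{
    $data = '';
    while (!feof($stream)) {
        $chunk = fread($stream, $chunkSize);
        if ($chunk === false) {
            break; // read error
        }
        $data .= $chunk;
    }
    return $data;
}

$stream = fopen('php://memory', 'w+');
fwrite($stream, '10');

// Truthiness-based loop: the chunk "0" is falsy, so the loop ends one byte early.
rewind($stream);
$naive = '';
while ($chunk = fread($stream, 1)) {
    $naive .= $chunk;
}

// Explicit EOF/false checks read everything.
rewind($stream);
$safe = readAll($stream, 1);
fclose($stream);

var_dump($naive); // string(1) "1"
var_dump($safe);  // string(2) "10"
```

With chunk sizes of 1024 bytes or more this edge case is unlikely in practice, but the explicit checks cost nothing.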
The list of functions that we can use to read local or remote contents is lengthy, and each function can be seen as a tool in your tool belt, suitable for a specific purpose.
According to the docs, file_get_contents is the preferred way of reading contents of a file into a string, but, is it appropriate for all purposes?
Ask yourself questions like this one, run some performance benchmarks, and select the function that best suits your needs.
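As one possible alternative to loading everything with file_get_contents, a generator can hand out one line at a time, so only the current line is held in memory. This is just a sketch: readLines() is a hypothetical helper and the scratch file is created only for the demonstration.

```php
<?php
/**
 * Yields one line at a time, so only the current line is held in memory.
 * readLines() is a hypothetical helper, not a built-in function.
 */
function readLines(string $path): Generator
{
    $stream = fopen($path, 'r');
    if ($stream === false) {
        throw new RuntimeException("Unable to open {$path}");
    }
    try {
        while (($line = fgets($stream)) !== false) {
            yield rtrim($line, "\r\n");
        }
    } finally {
        fclose($stream); // runs even if the caller stops iterating early
    }
}

// Hypothetical scratch file with three lines.
$path = tempnam(sys_get_temp_dir(), 'lines_demo_');
file_put_contents($path, "first\nsecond\nthird\n");

$lines = iterator_to_array(readLines($path), false);
unlink($path);
print_r($lines);
```

For a small file like this one, file_get_contents would be simpler; the generator starts to pay off when the file is too large to hold comfortably in memory.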
PSR-7 defines the StreamInterface, which libraries such as Guzzle use to represent request and response bodies. When you send a request or receive a response, the body is an instance of StreamInterface. Let's see an example, extracted from the Guzzle docs:
$client = new \GuzzleHttp\Client();
$response = $client->request('GET', 'http://httpbin.org/get');
$body = $response->getBody();
$body->seek(0);
$body->read(1024);
I suppose that the methods available on $body look familiar to you now :D
StreamInterface declares methods that closely resemble the functions we've just seen, such as read, write, seek, tell, rewind, eof, and getMetadata.
Last but not least, we can use GuzzleHttp\Psr7\Utils::streamFor to create streams from strings, resources opened with fopen and instances of StreamInterface:
use GuzzleHttp\Psr7;
$stream = Psr7\Utils::streamFor('string data');
echo $stream; // string data
echo $stream->read(3); // str
echo $stream->getContents(); // ing data
var_export($stream->eof()); // true
var_export($stream->tell()); // 11
In this article we've seen what streams really are, learned how to create them, read from them, write to them, and manipulate their pointers, and clarified some obscure parts regarding read and write buffers.
If I did a good job, some of the doubts you might have had regarding streams are now a little bit clearer and, from now on, you'll write code more confidently, because you know what you are doing.
Should you notice any errors or inaccuracies, or if any topic is still unclear, let me know in the comments and I'd be glad to try to help.