PHP performance optimization tool: generator
If you work in Python or other languages, you are probably familiar with generators. Many PHP developers, however, are not aware of them, either because generators only arrived in PHP 5.5.0 or because what they do is not obvious at first glance. Yet generators are genuinely useful.
Advantages
Starting with the concept itself tends to be confusing, so let's look at the advantages first; they may spark your interest:
● Generators can noticeably improve the performance of PHP applications that process large data sets
● They save a lot of memory while PHP code is running
● They are well suited to computing over large amounts of data
So how are these benefits achieved? Let's start with an example.
Concept introduction
Let's set the generator concept aside for a moment and look at a simple PHP function:
    function createRange($number){
        $data = [];
        for($i=0;$i<$number;$i++){
            $data[] = time();
        }
        return $data;
    }
This is a very common kind of PHP function that we often write when working with arrays. The code is simple:
1. We create a function.
2. The function contains a for loop that appends the current time to $data on each iteration.
3. After the for loop finishes, $data is returned.
We are not done yet. Next, let's call this function and print its return value in a loop:
    $result = createRange(10); // call the function we created above
    foreach($result as $value){
        sleep(1); // pause for 1 second; we will use this later
        echo $value.'<br />';
    }
Running this in the browser prints ten timestamps, one per line.
The result is exactly as expected, with no problems at all. (Of course, the output alone does not show the effect of sleep(1).)
Think about a question
Notice that when we called createRange, the value passed as $number was 10, a very small number. Now suppose we pass 10000000 (10 million) instead.
Then the for loop in createRange has to run 10 million times, and 10 million values are stored in $data, which is held in memory. Calling the function therefore consumes a lot of memory.
This is where generators come in.
Create generator
Let's modify the code directly. Pay close attention to what changes:
    function createRange($number){
        for($i=0;$i<$number;$i++){
            yield time();
        }
    }
This code is very similar to the previous version. We removed the $data array and the function no longer returns anything; instead, the keyword yield is placed before time().
Use the generator
Let's run the second piece of code again:
    $result = createRange(10); // call the function we created above
    foreach($result as $value){
        sleep(1);
        echo $value.'<br />';
    }
Surprisingly, the output differs from the first run without the generator: the values (timestamps) are now 1 second apart.
This one-second interval is the effect of sleep(1). But why was there no gap the first time? Because:
● Without a generator: the for loop in createRange fills $data almost instantly and returns it right away, so foreach iterates over a fixed, already-built array.
● With a generator: the values of createRange are not produced all at once; they are produced on demand by the foreach loop. Each foreach iteration triggers exactly one iteration of the for loop.
At this point, you should have some idea about the generator.
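The difference between the two runs can also be made visible without sleep(1). The following is a minimal sketch, not the article's code: the function name produce and the $events log are hypothetical, and integers replace time() so the result is deterministic. It records when the generator body runs and when the foreach body runs.

```php
<?php
// Hypothetical helper: yields 0..$number-1 and logs each step it takes.
function produce(int $number, ArrayObject $events): Generator
{
    for ($i = 0; $i < $number; $i++) {
        $events[] = "produce $i"; // the for loop runs one step
        yield $i;                 // ...and pauses here until foreach asks again
    }
}

$events = new ArrayObject();
foreach (produce(3, $events) as $value) {
    $events[] = "consume $value"; // the foreach body runs between two yields
}

echo implode("\n", $events->getArrayCopy()), "\n";
```

The log alternates between "produce" and "consume" entries, which is the same interleaving that causes the one-second gaps between timestamps.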
In-depth understanding of the generator
Code analysis
Let's analyze the code we just ran.
    function createRange($number){
        for($i=0;$i<$number;$i++){
            yield time();
        }
    }
    $result = createRange(10); // call the function we created above
    foreach($result as $value){
        sleep(1);
        echo $value.'<br />';
    }
Let's walk through how the code executes.
1. Calling createRange(10) does not run the for loop yet; it returns a generator object. When foreach starts, the for loop runs until it reaches the first yield, pauses there, and hands foreach the value for the first iteration.
2. foreach begins its first iteration over $result: it executes sleep(1) and then outputs the value it received from the for loop.
3. foreach prepares for the second iteration. Before starting it, it asks the for loop for the next value.
4. The for loop resumes, runs one more iteration, and hands the newly generated timestamp to foreach.
5. foreach receives the second value and outputs it. Because of the sleep(1) inside foreach, the for loop produced this value 1 second later than the previous one, which is why the timestamps differ by 1 second.
Therefore, throughout the execution only one value takes part in the loop at any time, and only that one value is held in memory.
No matter how large $number is, the full result set is never built up front, so memory usage stays at roughly one value per iteration.
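The memory claim can be checked with memory_get_usage(). The sketch below is an illustration under assumptions: rangeAsArray, rangeAsGenerator, and peakIncrease are hypothetical names, integers stand in for time(), and the exact byte counts vary by PHP version and platform, so none are quoted here.

```php
<?php
// Array version: builds all values before returning.
function rangeAsArray(int $number): array
{
    $data = [];
    for ($i = 0; $i < $number; $i++) {
        $data[] = $i;
    }
    return $data;
}

// Generator version: produces one value per request.
function rangeAsGenerator(int $number): Generator
{
    for ($i = 0; $i < $number; $i++) {
        yield $i;
    }
}

// Hypothetical helper: largest growth in memory_get_usage() seen while
// iterating over whatever $makeValues() returns.
function peakIncrease(callable $makeValues): int
{
    $before = memory_get_usage();
    $peak = 0;
    foreach ($makeValues() as $value) {
        $peak = max($peak, memory_get_usage() - $before);
    }
    return $peak;
}

$n = 100000;
$arrayBytes = peakIncrease(function () use ($n) { return rangeAsArray($n); });
$generatorBytes = peakIncrease(function () use ($n) { return rangeAsGenerator($n); });

printf("array:     %d bytes\n", $arrayBytes);
printf("generator: %d bytes\n", $generatorBytes);
```

The array figure grows with $n, while the generator figure should stay roughly constant when $n is increased.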
Conceptual understanding
At this point, you should have a rough understanding of what a generator is. Now let's look at how generators work.
First, one concept needs to be clear: yield does not return a value in the way return does. It yields (produces) a value to the caller and pauses the function, which can later resume from the same point.
So what is the foreach loop iterating over? When a generator function is called, PHP returns an object of the built-in Generator class, which foreach can iterate. On each iteration, PHP asks the Generator instance to compute the next value, so foreach always knows which value comes next.
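Moreover, at run time the for loop pauses immediately after producing each value. Only when foreach asks for the next value on its next iteration does the for loop run once more, and then it pauses again. This repeats until the loop condition is no longer met and the generator finishes.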
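What foreach does with the Generator object can be written out by hand, because Generator implements PHP's Iterator interface. This is a minimal sketch assuming PHP 7 or later; the function yields $i * 10 instead of time() purely to keep the values predictable.

```php
<?php
function createRange(int $number): Generator
{
    for ($i = 0; $i < $number; $i++) {
        yield $i * 10; // illustrative value instead of time()
    }
}

$gen = createRange(3);

// Calling the function only creates the object; no value exists yet.
$isGenerator = $gen instanceof Generator;
$isIterator  = $gen instanceof Iterator;

// Manual equivalent of foreach ($gen as $key => $value) { ... }
$seen = [];
while ($gen->valid()) {                      // is the generator paused at a yield?
    $seen[$gen->key()] = $gen->current();    // read the yielded key and value
    $gen->next();                            // resume the for loop until the next yield
}

print_r($seen);
```

Each call to next() resumes the function body for exactly one loop step, which is the pause-and-resume behaviour described above.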
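Practical applications
Many PHP developers are unfamiliar with generators mainly because they do not know where to apply them. So what are generators used for in real-world development?
Reading very large files
PHP code often needs to read large files, such as CSV files, text files, or log files. If such a file is very large, say 5 GB, reading its entire contents into memory at once for processing is not realistic.
This is where generators come in handy. Here is a simple example: reading a text file.
We create a text file, type a few lines of text into it, and read it as a demonstration.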
    <?php
    header("content-type:text/html;charset=utf-8");

    function readTxt()
    {
        $handle = fopen("./test.txt", 'rb');
        // fgets() returns false at end of file, which ends the loop
        while (($line = fgets($handle)) !== false) {
            yield $line;
        }
        fclose($handle);
    }

    foreach (readTxt() as $key => $value) {
        echo $value.'<br />';
    }
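As the output shows, the code works as expected.
Behind the scenes, however, the execution is completely different. With a generator, the first iteration reads the first line, the second iteration reads the second line, and so on; only one line of text is loaded into memory at a time, which greatly reduces memory usage.
This way, even a text file of several gigabytes is nothing to worry about: you can write the code just as you would for a very small file.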
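A slightly more defensive variant of the same idea is sketched below. It is not the article's code: the name readLines, the temporary file, and the sample text are assumptions made so the example runs on its own. It checks the result of fopen(), strips line endings, and uses try/finally so the handle is closed once the generator finishes.

```php
<?php
// Hypothetical helper: yields the lines of $path one at a time.
function readLines(string $path): Generator
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        // Only the current line is held in memory.
        while (($line = fgets($handle)) !== false) {
            yield rtrim($line, "\r\n");
        }
    } finally {
        fclose($handle);
    }
}

// Demo data in a temporary file (stands in for a multi-gigabyte log).
$path = tempnam(sys_get_temp_dir(), 'gen');
file_put_contents($path, "first line\nsecond line\nthird line\n");

$lineCount = 0;
$longest = '';
foreach (readLines($path) as $line) {
    $lineCount++;
    if (strlen($line) > strlen($longest)) {
        $longest = $line;
    }
}
unlink($path);

echo "lines: $lineCount, longest: $longest\n";
```

The consuming loop computes its aggregates line by line, so its memory use does not depend on the size of the file.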
Batch updating database table fields
    /**
     * @desc: method description
     * @param int $count total number of records (determines how many loops are needed)
     * @param int $limit batch size (number of records per array)
     * @return \Generator
     */
    public function getAddressContent($count = 1, $limit = 20000)
    {
        for ($i = 0; $i < ceil($count / $limit); $i++) {
            $result = StudentModel::where('id','<','67265')
                ->limit($i * $limit, $limit)
                ->order('id desc')
                ->select()->toArray();
            yield $result;
        }
    }

    /**
     * @desc: update the province and city fields in the database
     * @throws Exception
     */
    public function idCard()
    {
        $count = 200000000; // number of records to update
        foreach ($this->getAddressContent($count) as $key=>$lists) {
            foreach ($lists as $k => $v) {
                $peopleIdentity = new Identity($v['idcard']);
                $peopleRegion = $peopleIdentity->region();
                if($peopleRegion->code() != 0 ){
                    $res = StudentModel::where('id', $v['id'])->update([
                        'birthday' => $peopleIdentity->birthday()??'',
                        'province' => $peopleRegion->province()??'',
                        'city' => $peopleRegion->city()??'',
                        'county' => $peopleRegion->county()??'',
                    ]);
                    Log::debug('Update result [' . $v['id'] . ']: ' . json_encode($res));
                }
            }
        }
        echo "success";
    }
Run it from the command line:
    php id_card.php
[Screenshot: log output]
[Screenshot: CPU and memory consumption]
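The batching pattern can be tried without a database. In the sketch below everything is hypothetical: an in-memory array stands in for the table, the $fetchBatch closure stands in for the model query, and batches is an illustrative name. Unlike the method above, it also stops as soon as a batch comes back empty.

```php
<?php
// Hypothetical helper: yields one batch of rows per iteration.
function batches(callable $fetchBatch, int $total, int $limit): Generator
{
    $pages = (int) ceil($total / $limit);
    for ($page = 0; $page < $pages; $page++) {
        $rows = $fetchBatch($page * $limit, $limit);
        if ($rows === []) {
            return; // no more data: end the generator early
        }
        yield $page => $rows; // only this batch is held in memory
    }
}

// In-memory stand-in for the table (assumption for the demo).
$table = [];
for ($id = 1; $id <= 7; $id++) {
    $table[] = ['id' => $id, 'city' => ''];
}
$fetchBatch = function (int $offset, int $limit) use ($table): array {
    return array_slice($table, $offset, $limit);
};

$batchSizes = [];
$updatedIds = [];
foreach (batches($fetchBatch, count($table), 3) as $page => $rows) {
    $batchSizes[$page] = count($rows);
    foreach ($rows as $row) {
        $updatedIds[] = $row['id']; // a real job would update the row here
    }
}

echo 'batch sizes: ', implode(', ', $batchSizes), "\n";
```

With seven rows and a batch size of three, the loop sees batches of 3, 3, and 1 rows; at most one batch exists in memory at any time.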