Home >Backend Development >PHP7 >Detailed explanation of PHP data structure extensions

Detailed explanation of PHP data structure extensions

coldplay.xixi
coldplay.xixiforward
2020-06-22 17:47:213211browse

Detailed explanation of PHP data structure extensions

#Disclaimer: This article is licensed under CC BY-NC-ND 4.0.

There is only one data type that represents a collection in PHP: Array. I believe everyone who is new to PHP will be confused about it. This thing should look like an Array or List in other languages, but in PHP it is everything, both a List and a Map:

<?php $a = array(1, 2, 3);
$b = array(&#39;key1&#39; => 1, 'key2' => 2);

This sounds good. Anyway, everyone uses the same data structure. Occasionally there will be some performance problems. Moreover, after upgrading to PHP7, the performance of Array has also improved. If it is not good, you can add more memory. But if we can optimize performance by introducing more convenient data structures, and writing code will be more convenient at the same time, then why not?

Recommended tutorial: "PHP Tutorial"

Disadvantages of Array

Sometimes we need to save a set (Set), but Array does not guarantee the uniqueness of elements, and array_unique has inevitable performance losses. A compromise is to use the element as the key and the value as true to achieve the function of Unique Array:

<?php $users = User::find($ids);
$res = [];
foreach ($users as $user) {
  $res[$user->id] = true;
}

PHP's Array can get null when accessing a non-existent key, and will not generate a fatal error, but there will be an E_NOTICE. This E_NOTICE will be intercepted by the function registered by set_error_handler. Obviously, this kind of unclean code and unnecessary performance overhead can be completely avoided.

<?php $req = [];
$req[&#39;user_id&#39;]; // PHP Notice:  Undefined offset

You can use array_key_exists and if else to make the code cleaner, but this will appear verbose.

Some functional methods of array are difficult to use, such as array_map, array_walk, etc., and they are also ugly to write. Of course, there is no good way to do this with native PHP. After all, PHP’s object-oriented genes are not very strong.

<?php array_map(function($user){
  return $user->is_deleted();
}, $users);
// 就是这么难看
users.map { |user| user.is_deleted? }
# ruby 的就好看多了

In some cases, the performance of using Array is very poor1, such as the following code:

<?php $a=[1,2,3,4,5,6,7];
echo $a[5];
// 6
array_unshift($a, 0);
// $a: [0,1,2,3,4,5,6,7];
echo $a[5];
// 5

It may seem like nothing, but it should be noted that Array is essentially a Map. Unshifting an element will change the key of each element. This is an $O(n)$ operation. In addition, PHP's Array saves its value (including the expanded key and its hash) in a bucket, so we need to check each bucket and update the hash. PHP actually performs the array_unshift operation internally by creating a new array, and the performance issues can be imagined2.

There are many other shortcomings.

PHP data structure plug-in

Array has been criticized, and alternatives will appear. PHP5 has spl, but the performance in some scenarios is very poor and the design is very poor1. Laravel's Collection provides a more useful Map, but after all, it is only a single data structure, and it has designed many unique interfaces for ORM operations, and its use is limited.

PHP7’s new Data Structures plug-in (ds for short) is the next excellent addition to PHP, which fully considers the needs of convenience, security and neatness. As shown below.

Detailed explanation of PHP data structure extensions

It provides 3 interface classes: Collection, Sequence, Hashable and 7 implementation classes (final class): Vector, Deque, Map, Set, Stack, Queue, PriorityQueue.

Interface

Collection is the basic interface, which defines the basic operations of a data collection (the collection here refers to Collection, not Set), such as foreach, json_encode, var_dump, etc.

<?php $sequence = new \Ds\Vector([1, 2, 3]);
json_encode($sequence);

Sequence is the basic interface of array-like data structures and defines many important and convenient methods, such as contains, map, filter, reduce, find, first, last, etc. As can be seen from the figure, Vector, Deque, Stack, and Queue all directly or indirectly implement this interface.

<?php $sequence = new \Ds\Vector([1, 2, 3]);

print_r($sequence->map(function($value) { return $value * 2; }));
print_r($sequence);
?>

Hashable looks isolated in the diagram, but it is important for Map and Set. If an Object implements Hashable, it can be used as the key of Map and the element of Set. In this way, Map and Set can be used as conveniently as Java.

Implementation class

Vector should be one of the most commonly used data structures. You can think of it as Ruby's Array or Python's List. The index of its element's value is its index in the buffer, so it is very efficient. You can use it as long as you need to use an array and do not need insert, remove, shift and unshift.

Deque([dek]) is a double-ended queue, which adds a head pointer to Vector, so shift and unshift are also $O(1)$ complex. But the performance loss is not much, so there is also discussion on whether only one Deque is enough and no Vector is needed (discussion) 3.

Stack 栈,嗯没什么好说的,它继承自 Collection,但内部使用 Vector 实现。这样做的好处是实现方便,且同时可以屏蔽不需要的和不应该出现的方法。

Queue 队列,内部使用 Deque 实现。

PriorityQueue,最大堆实现。

Map。以前使用 Array 来实现 map 的地方,改用 Map 更好。二者性能几乎一致,但 Map 对内存的管理更好。而且,Map 的语法要更加友好。

<?php
$req = [];
$req[&#39;user_id&#39;]; // PHP Notice:  Undefined offset

$req = new \Ds\Map(["a" => 1, "b" => 2, "c" => 3]);
$req->get(&#39;user_id&#39;);// OutOfBoundsException
$req->get(&#39;user_id&#39;, 0); // 0 是默认值
// 即可以方便的指定默认值,也可以选择抛出异常。不用 array,不会产生 E_NOTICE

$req->keys();

$req->map(function($key, $value) { return $value * 2; });
   

不仅如此,只要 object 继承了 Hashable,Map 还允许使用 object 作为 key。

<?php
class Photo implements \Ds\Hashable {
    
    public function __construct($id) {
        $this->id = $id;
    }
    
    public function hash() {
        return $this->id;
    }

    public function equals($obj): bool {
        return $this->id === $obj->id;
    }
}

$p1 = new Photo(1);
$p2 = new Photo(2);

$map = new Ds\Map();
$map->put($p1, 1);
$map->put($p2, 2);
   

Set 集合是一种元素唯一的数据结构。和 array_unique 相比性能有很大提升,而且用法也更加优雅1

<?php
$set = new Ds\Set();
$set->add($p1);
$set->add($p2);
   
  1. php ds 插件性能测试 ↩ ↩2  ↩3

  2. 当然,这一点可能稍嫌牵强,毕竟即使是数据量很大的情况下,array_unshift 的耗时也没有那么大 ↩

  3. github 上还在讨论可以增加一个不可变类型 Tuple,以及取消 Vector 直接使用 Deque,讨论地址和 2.0API 计划 ↩

The above is the detailed content of Detailed explanation of PHP data structure extensions. For more information, please follow other related articles on the PHP Chinese website!

Statement:
This article is reproduced at:juejin.im. If there is any infringement, please contact admin@php.cn delete