Python standard library collections usage tutorial

黄舟
2017-02-04 16:49:13

Introduction

Python provides four basic built-in data structures: list, tuple, dict, and set. When working with large amounts of data, however, these four are sometimes not enough. For example, a list is a dynamic array, so inserting or deleting elements near its front is slow, and sometimes we need a dict that maintains a particular order. In these cases we can turn to the collections module in the Python standard library, which provides a number of useful container classes. Being proficient with these classes not only makes our code more Pythonic but can also make our programs run more efficiently.

Usage of defaultdict

defaultdict(default_factory) adds a default_factory to an ordinary dict (dictionary): when a key does not exist, a value of the corresponding type is generated for it automatically. The default_factory parameter can be list, set, int, or any other callable that takes no arguments.

example1

>>> from collections import defaultdict
>>> s = [('red', 1), ('blue', 2), ('red', 3), ('blue', 4), ('red', 1), ('blue', 4)]

We now have the list above. Although it contains 6 pairs, a closer look shows only two colors, each associated with several values. We want to convert this list into a dict in which each key is a color and each value is a list holding all the values for that color. defaultdict(list) solves this neatly.

# d can be regarded as a dict whose values are lists
>>> d = defaultdict(list)
>>> for k, v in s:
...     d[k].append(v)
...
>>> d
defaultdict(<class 'list'>, {'blue': [2, 4, 4], 'red': [1, 3, 1]})

example2

The example above has a shortcoming: in {'blue': [2, 4, 4], 'red': [1, 3, 1]}, blue contains two 4s and red contains two 1s. If we do not want duplicate elements, we can use defaultdict(set) instead. The difference between a set and a list is that a set cannot contain the same element twice.

>>> d = defaultdict(set)
>>> for k, v in s:
...     d[k].add(v)
...
>>> d
defaultdict(<class 'set'>, {'blue': {2, 4}, 'red': {1, 3}})

example3

>>> s = 'hello world'

With defaultdict(int) we can count the number of occurrences of each character in a string. Since int() returns 0, every missing key starts counting from 0.

>>> d = defaultdict(int)
>>> for k in s:
...     d[k] += 1
...
>>> d
defaultdict(<class 'int'>, {'o': 2, 'h': 1, 'w': 1, 'l': 3, ' ': 1, 'd': 1, 'e': 1, 'r': 1})
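
The way default_factory creates values can be sketched as follows. This is only a minimal illustration; the names counts and labels and the "unknown" default are made up for the example. It also shows a detail that is easy to overlook: reading a missing key with square brackets inserts that key, while get() does not.

```python
from collections import defaultdict

# int() returns 0, list() returns [], set() returns set(): these are the
# default values created for a missing key.
counts = defaultdict(int)
counts["a"] += 1          # "a" is missing, so it starts from int() == 0

# Merely reading a missing key with [] also inserts it ...
_ = counts["b"]
# ... whereas .get() and the `in` operator do not.
missing = counts.get("c")

# Any zero-argument callable works as a factory (hypothetical default value).
labels = defaultdict(lambda: "unknown")
label = labels["x"]

print(dict(counts))   # {'a': 1, 'b': 0}
print(missing)        # None
print(label)          # unknown
```

If silently created keys are not wanted, a plain dict with get() or setdefault() may be the safer choice.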

Usage of OrderedDict

Before Python 3.7, the built-in dict (dictionary) did not guarantee any ordering (since Python 3.7 it preserves insertion order), but in some cases we need to keep a dict in a particular order and rearrange it. For this we can use OrderedDict, a subclass of dict that remembers the order of its keys and adds order-related methods. Let's take a look at how to use it.

example1

>>> from collections import OrderedDict
# an ordinary dict, not sorted in any way
>>> d = {'banana': 3, 'apple': 4, 'pear': 1, 'orange': 2}

This dict is not sorted. We can use OrderedDict together with sorted() to build a sorted version of it.

# sort d by key
>>> OrderedDict(sorted(d.items(), key=lambda t: t[0]))
OrderedDict([('apple', 4), ('banana', 3), ('orange', 2), ('pear', 1)])
# sort d by value
>>> OrderedDict(sorted(d.items(), key=lambda t: t[1]))
OrderedDict([('pear', 1), ('orange', 2), ('banana', 3), ('apple', 4)])
# sort d by the length of the key
>>> OrderedDict(sorted(d.items(), key=lambda t: len(t[0])))
OrderedDict([('pear', 1), ('apple', 4), ('orange', 2), ('banana', 3)])

example2

popitem(last=True) removes and returns key-value pairs from the dict in LIFO (last in, first out) order, that is, it removes the most recently inserted pair. With last=False, pairs are removed in FIFO (first in, first out) order.

>>> d = {'banana': 3, 'apple': 4, 'pear': 1, 'orange': 2}
# sort d by key
>>> d = OrderedDict(sorted(d.items(), key=lambda t: t[0]))
>>> d
OrderedDict([('apple', 4), ('banana', 3), ('orange', 2), ('pear', 1)])
# use popitem() to remove the last key-value pair
>>> d.popitem()
('pear', 1)
# use popitem(last=False) to remove the first key-value pair
>>> d.popitem(last=False)
('apple', 4)

example3

move_to_end(key, last=True) changes the order of the key-value pairs in an OrderedDict object: it moves an existing key-value pair to the end of the dictionary, or to the beginning when last=False.

>>> d = OrderedDict.fromkeys('abcde')
>>> d
OrderedDict([('a', None), ('b', None), ('c', None), ('d', None), ('e', None)])
# move the key-value pair with key b to the end of the dict
>>> d.move_to_end('b')
>>> d
OrderedDict([('a', None), ('c', None), ('d', None), ('e', None), ('b', None)])
>>> ''.join(d.keys())
'acdeb'
# move the key-value pair with key b to the front of the dict
>>> d.move_to_end('b', last=False)
>>> ''.join(d.keys())
'bacde'
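
Taken together, popitem(last=False) and move_to_end() are enough to sketch a small least-recently-used cache. The LRUCache class below is a hypothetical helper written for this illustration, not something provided by collections, and it is only one possible way to combine the two methods.

```python
from collections import OrderedDict


class LRUCache:
    """Hypothetical helper: keeps at most `capacity` most recently used items."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key, default=None):
        if key not in self.data:
            return default
        self.data.move_to_end(key)         # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the least recently used item


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")            # "a" is now the most recently used key
cache.put("c", 3)         # evicts "b"
print(list(cache.data))   # ['a', 'c']
```

For caching function results, functools.lru_cache in the standard library is usually the more convenient tool; the sketch is only meant to show how the two OrderedDict methods fit together.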

Usage of deque

The advantage of storing data in a list is that looking up elements by index is fast, but inserting and deleting elements at the front is slow, because a list is a dynamic array and all later elements have to be shifted. deque is a double-ended queue designed for efficient insertion and deletion at both ends. It is well suited to queues and stacks, and its append and pop operations are thread-safe.

A list can only insert and delete efficiently at its end, with append and pop. deque adds appendleft and popleft, which let us insert and delete elements efficiently at the beginning of the sequence as well. Appending or popping at either end of a deque is approximately O(1), whereas list operations that change both the length and the positions of the stored data, such as pop(0) and insert(0, v), are O(n). Since the rest of the deque interface is basically the same as that of list, it is not repeated here.
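
As a brief illustration of the operations at both ends, here is a minimal sketch. The variable names are arbitrary, and rotate() and the maxlen argument are included as two deque features that list does not have.

```python
from collections import deque

q = deque([1, 2, 3])
q.appendleft(0)        # deque([0, 1, 2, 3])
q.append(4)            # deque([0, 1, 2, 3, 4])
first = q.popleft()    # removes and returns 0
last = q.pop()         # removes and returns 4

q.rotate(1)            # shift right by one: deque([3, 1, 2])

# With maxlen, appending to a full deque drops items from the opposite end.
recent = deque(maxlen=3)
for i in range(5):
    recent.append(i)

print(first, last)     # 0 4
print(list(q))         # [3, 1, 2]
print(list(recent))    # [2, 3, 4]
```

A bounded deque like recent is a simple way to keep only the last few items of a stream.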

Usage of ChainMap

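ChainMap groups multiple dicts (dictionaries) into a single view backed by a list of mappings, which can be thought of as merging the dictionaries. Unlike update, it does not copy any data: lookups search the underlying dicts in order and return the first match, so creating a ChainMap is cheap and later changes to the underlying dicts are visible through it.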

>>> from collections import ChainMap
>>> a = {'a': 'A', 'c': 'C'}
>>> b = {'b': 'B', 'c': 'D'}
# construct a ChainMap object
>>> m = ChainMap(a, b)
>>> m
ChainMap({'a': 'A', 'c': 'C'}, {'b': 'B', 'c': 'D'})
>>> m['a']
'A'
>>> m['b']
'B'
# m.maps is the underlying list of dicts
>>> m.maps
[{'a': 'A', 'c': 'C'}, {'b': 'B', 'c': 'D'}]

# updating a value in a is also visible through the ChainMap object
>>> a['c'] = 'E'
>>> m['c']
'E'
# new_child() returns a new ChainMap with an empty dict placed in front of the maps of m;
# writes to it go into that new dict, so they do not affect m or a
>>> m2 = m.new_child()
>>> m2['c'] = 'f'
>>> m['c']
'E'
>>> a['c']
'E'
>>> m2.parents
ChainMap({'a': 'A', 'c': 'E'}, {'b': 'B', 'c': 'D'})
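
The difference from update can be made concrete with a small sketch. The defaults and overrides dictionaries are hypothetical settings invented for this example; the point is that a ChainMap keeps looking at the original dicts, while a merged copy is a snapshot.

```python
from collections import ChainMap

defaults = {"color": "red", "user": "guest"}   # hypothetical settings
overrides = {"user": "admin"}

settings = ChainMap(overrides, defaults)   # earlier mappings win on lookup
merged = {**defaults, **overrides}         # a one-off copy, like dict.update

defaults["color"] = "blue"                 # change an underlying dict afterwards

print(settings["user"])    # admin
print(settings["color"])   # blue  (the ChainMap sees the change)
print(merged["color"])     # red   (the copy does not)

# Writes and deletions only touch the first mapping.
settings["color"] = "green"
print(overrides)           # {'user': 'admin', 'color': 'green'}
print(defaults["color"])   # blue
```

Putting the overriding dict first is the usual arrangement for layered configuration, because lookups stop at the first mapping that contains the key.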

Usage of Counter

example1

Counter is also a subclass of dict. It is a container that works as a counter: the elements are stored as keys and their counts as values.

>>> from collections import Counter
>>> cnt = Counter()
# count the occurrences of each element in a list
>>> for word in ['red', 'blue', 'red', 'green', 'blue', 'blue']:
...     cnt[word] += 1
...
>>> cnt
Counter({'blue': 3, 'red': 2, 'green': 1})
# count the occurrences of each character in a string
>>> cnt = Counter()
>>> for ch in 'hello':
...     cnt[ch] = cnt[ch] + 1
...
>>> cnt
Counter({'l': 2, 'o': 1, 'h': 1, 'e': 1})
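
The explicit loops above can usually be replaced by passing the iterable straight to Counter. The following minimal sketch reuses the same sample data; the extra update() call and the 'purple' lookup are additions made for illustration.

```python
from collections import Counter

words = ['red', 'blue', 'red', 'green', 'blue', 'blue']
cnt = Counter(words)          # same counts as the explicit loop
chars = Counter('hello')

cnt.update(['red', 'black'])  # update() adds counts instead of replacing them

print(cnt['blue'])     # 3
print(cnt['red'])      # 3
print(cnt['purple'])   # 0  (missing keys count as zero and are not inserted)
print(chars['l'])      # 2
```

Unlike defaultdict(int), looking up a missing key in a Counter returns 0 without adding the key.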

example2

The elements() method returns an iterator that repeats each element as many times as its count. Before Python 3.7 the elements were returned in arbitrary order; since Python 3.7 they come in the order they were first encountered. Elements whose count is less than 1 are ignored.

>>> c = Counter(a=4, b=2, c=0, d=-2)
>>> c
Counter({'a': 4, 'b': 2, 'c': 0, 'd': -2})
>>> c.elements()
<itertools.chain object at 0x7fb0a069ccf8>
# a Counter is not an iterator itself, so call next() on the iterator returned by elements()
>>> next(c.elements())
'a'
# sort the elements
>>> sorted(c.elements())
['a', 'a', 'a', 'a', 'b', 'b']

most_common(n) returns a list of the n most common elements in the Counter object together with their counts, ordered from most to least common.

>>> c = Counter('abracadabra')
>>> c
Counter({'a': 5, 'b': 2, 'r': 2, 'd': 1, 'c': 1})
>>> c.most_common(3)
[('a', 5), ('b', 2), ('r', 2)]

Usage of namedtuple

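namedtuple(typename, field_names) creates a tuple subclass whose elements can be accessed by name as well as by index, which makes programs more readable. The first argument, typename, becomes the name of the generated class.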

>>> from collections import namedtuple
>>> Point = namedtuple('PointExtension', ['x', 'y'])
>>> p = Point(1, 2)
>>> p.__class__.__name__
'PointExtension'
>>> p.x
1
>>> p.y
2
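
Beyond attribute access, a namedtuple still behaves like an ordinary tuple. The sketch below uses a Point type named 'Point' (chosen for this example rather than taken from the session above) to show unpacking, _replace() and _asdict().

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])
p = Point(x=1, y=2)

x, y = p                    # unpacks like a regular tuple
moved = p._replace(x=10)    # namedtuples are immutable; _replace returns a new one
as_dict = p._asdict()

print(p)               # Point(x=1, y=2)
print(x + y)           # 3
print(moved)           # Point(x=10, y=2)
print(dict(as_dict))   # {'x': 1, 'y': 2}
print(p == (1, 2))     # True
```

Because instances are immutable, _replace() leaves p unchanged and returns a fresh object.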
