Python Data Structures: An Underrated Namedtuple (1)

coldplay.xixi
2020-10-19 17:45:52

This Python tutorial column introduces namedtuple, one of Python's data structures.

This article covers the key uses of namedtuple in Python, moving from basic concepts to more advanced ones. You'll learn why and how to use namedtuples to write cleaner code, and by the end of this guide you may well prefer them for this kind of data.

Learning Objectives

At the end of this tutorial, you should be able to:

  • Understand why and when to use namedtuple
  • Convert a regular tuple or dictionary to a namedtuple
  • Convert a namedtuple to a dictionary or regular tuple
  • Sort a list of namedtuples
  • Understand the difference between namedtuple and data classes (dataclass)
  • Create a namedtuple with optional fields
  • Serialize a namedtuple to JSON
  • Add a documentation string (docstring)

Why use namedtuple?

namedtuple is a very interesting (and underrated) data structure. It is easy to find Python code that relies heavily on regular tuples and dictionaries to store data. There is nothing wrong with that in itself, but both are often misused, as the following examples show.

Suppose you have a function that converts a string to a color. The color must be represented by four components: red, green, blue, and alpha (RGBA).

def convert_string_to_color(desc: str, alpha: float = 0.0):
    if desc == "green":
        return 50, 205, 50, alpha
    elif desc == "blue":
        return 0, 0, 255, alpha
    else:
        return 0, 0, 0, alpha

Then, we can use it like this:

r, g, b, a = convert_string_to_color(desc="blue", alpha=1.0)

Okay, fine. But we have several problems here. The first is that nothing guarantees the returned values are unpacked in the right order. That is, nothing prevents another developer from calling convert_string_to_color like this:

g, b, r, a = convert_string_to_color(desc="blue", alpha=1.0)

In addition, we may not know that the function returns 4 values and may call it like this:

r, g, b = convert_string_to_color(desc="blue", alpha=1.0)

Because the function returns four values but only three names are given to unpack them into, a ValueError is raised and the call fails.
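Both problems can be reproduced in a few lines. The following is a minimal sketch, not part of the original article; the helper name unpack_three is hypothetical and exists only to capture the exception.

```python
def convert_string_to_color(desc: str, alpha: float = 0.0):
    """Return a plain (r, g, b, alpha) tuple, as in the example above."""
    if desc == "green":
        return 50, 205, 50, alpha
    elif desc == "blue":
        return 0, 0, 255, alpha
    return 0, 0, 0, alpha


def unpack_three(desc: str) -> str:
    """Hypothetical helper: unpack the 4-tuple into 3 names and report the result."""
    try:
        r, g, b = convert_string_to_color(desc=desc, alpha=1.0)
    except ValueError as exc:
        return f"ValueError: {exc}"
    return "ok"


# Swapping the names raises no error at all; the values are silently mislabeled.
g, b, r, a = convert_string_to_color(desc="blue", alpha=1.0)
swapped = (r, g, b, a)

print(unpack_three("blue"))  # the unpacking error is caught and reported
print(swapped)               # "blue" now reads as if it were red
```

The wrong-order call is the more dangerous of the two, because it fails silently instead of raising.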

Indeed. But, you might ask, why not use a dictionary?

Python's dictionary is a very general data structure and a convenient way to store multiple values. However, dictionaries are not without drawbacks. Because they are so flexible, they are easily misused. Let's look at an example using a dictionary.

def convert_string_to_color(desc: str, alpha: float = 0.0):
    if desc == "green":
        return {"r": 50, "g": 205, "b": 50, "alpha": alpha}
    elif desc == "blue":
        return {"r": 0, "g": 0, "b": 255, "alpha": alpha}
    else:
        return {"r": 0, "g": 0, "b": 0, "alpha": alpha}

Okay, we can now use it like this, expecting only one value to be returned:

color = convert_string_to_color(desc="blue", alpha=1.0)

We no longer need to remember the order, but this approach has at least two disadvantages. The first is that we have to keep track of the key names. If we change {"r": 0, "g": 0, "b": 0, "alpha": alpha} to {"red": 0, "green": 0, "blue": 0, "a": alpha}, accessing a field raises a KeyError because the keys r, g, b, and alpha no longer exist.

The second problem with dictionaries is that they are not hashable. This means we cannot store them in sets or use them as keys in other dictionaries. Let's say we want to track how many colors a particular image has. If we use collections.Counter to count, we will get TypeError: unhashable type: 'dict'.

Also, dictionaries are mutable, so anyone can add as many new keys as they like. Bugs introduced this way tend to be nasty and hard to spot.
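The three dictionary pitfalls described above can be seen side by side in a short sketch. The variable names here are illustrative only, and the exact wording of the TypeError message may differ between Python versions.

```python
from collections import Counter

green = {"r": 50, "g": 205, "b": 50, "alpha": 1.0}

# 1. A renamed or misspelled key only fails when it is accessed.
try:
    green["red"]
    key_error = None
except KeyError as exc:
    key_error = repr(exc)

# 2. Dicts are unhashable, so Counter (or a set) rejects them.
try:
    Counter([green, green])
    type_error = None
except TypeError as exc:
    type_error = str(exc)

# 3. Nothing stops anyone from adding an unexpected key later.
green["e"] = 0

print(key_error)
print(type_error)
print(sorted(green))
```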

Okay, good. So what now? What can I use instead?

namedtuple! Yes, that's it!

Convert our function to use namedtuple:

from collections import namedtuple
...
Color = namedtuple("Color", "r g b alpha")
...
def convert_string_to_color(desc: str, alpha: float = 0.0):
    if desc == "green":
        return Color(r=50, g=205, b=50, alpha=alpha)
    elif desc == "blue":
        return Color(r=0, g=0, b=255, alpha=alpha)
    else:
        return Color(r=0, g=0, b=0, alpha=alpha)

As with dict, we can assign the result to a single variable and use it as needed, without remembering the order. Moreover, IDEs such as PyCharm and VSCode can offer autocompletion for the field names.

color = convert_string_to_color(desc="blue", alpha=1.0)
...
has_alpha = color.alpha > 0.0
...
is_black = color.r == 0 and color.g == 0 and color.b == 0

The most important thing is that namedtuple is immutable. If another developer on the team thinks it's a good idea to add a new field at runtime, the program will report an error.

>>> blue = Color(r=0, g=0, b=255, alpha=1.0)
>>> blue.e = 0
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-13-8c7f9b29c633> in <module>
----> 1 blue.e = 0

AttributeError: 'Color' object has no attribute 'e'

Not only that, now we can use Counter to keep track of how many colors a collection has.

>>> from collections import Counter
>>> Counter([blue, blue])
Counter({Color(r=0, g=0, b=255, alpha=1.0): 2})
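As a self-contained check of both properties, here is a small sketch. The helper try_setattr is hypothetical and only wraps the assignment so the script keeps running. Note that a namedtuple is hashable only when all of its field values are hashable, which holds for the numbers used here.

```python
from collections import Counter, namedtuple

Color = namedtuple("Color", "r g b alpha")
blue = Color(r=0, g=0, b=255, alpha=1.0)
green = Color(r=50, g=205, b=50, alpha=1.0)


def try_setattr(obj, name, value) -> bool:
    """Hypothetical helper: return True if the assignment succeeded."""
    try:
        setattr(obj, name, value)
    except AttributeError:
        return False
    return True


# Neither adding a new field nor changing an existing one is allowed.
added_new_field = try_setattr(blue, "e", 0)
changed_existing = try_setattr(blue, "r", 10)

# Equal namedtuples hash equally, so they work in Counter and in sets.
counts = Counter([blue, blue, green])
unique = {blue, Color(0, 0, 255, 1.0)}

print(added_new_field, changed_existing)
print(counts[blue], len(unique))
```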

How to convert regular tuples or dictionaries to namedtuple

Now that we understand why namedtuple is used, it's time to learn how to convert regular tuples and dictionaries to namedtuples. Let's say that, for some reason, you have a dictionary instance containing the RGBA values of a color. If you want to convert it to a Color namedtuple, you can do it as follows:

>>> c = {"r": 50, "g": 205, "b": 50, "alpha": 0}
>>> Color(**c)
Color(r=50, g=205, b=50, alpha=0)

We can use the ** operator to unpack the dict into keyword arguments for the namedtuple.

What if I want to create the namedtuple type itself from a dict?

No problem, just do the following:

>>> c = {"r": 50, "g": 205, "b": 50, "alpha": 0}
>>> Color = namedtuple("Color", c)
>>> Color(**c)
Color(r=50, g=205, b=50, alpha=0)

When a dict is passed to the namedtuple factory function, its keys become the field names. Then, as in the example above, Color(**c) unpacks the dictionary c to create a new instance.
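The two steps can be wrapped in one small function. This is a sketch only; namedtuple_from_dict is a hypothetical helper name, not part of the standard library. It assumes every key is a valid Python identifier, since namedtuple raises ValueError for invalid field names.

```python
from collections import namedtuple


def namedtuple_from_dict(type_name: str, data: dict):
    """Hypothetical helper: build a namedtuple type from the dict's keys,
    then an instance of that type from the dict's values."""
    record_type = namedtuple(type_name, list(data))
    return record_type(**data)


c = {"r": 50, "g": 205, "b": 50, "alpha": 0}
color = namedtuple_from_dict("Color", c)

print(color)
print(color._fields)
```

Creating a new type on every call is wasteful if many dictionaries share the same keys; in that case it is usually better to define the type once and reuse it with Color(**c).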

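How to convert a namedtuple to a dictionary or regular tuple

We have just learned how to convert a dict to a namedtuple. What about the other way around? How do we convert a namedtuple to a dictionary instance?

It turns out that namedtuple comes with a method called ._asdict(), so converting is as simple as calling it.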

>>> blue = Color(r=0, g=0, b=255, alpha=1.0)
>>> blue._asdict()
{'r': 0, 'g': 0, 'b': 255, 'alpha': 1.0}

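You may be wondering why the method name starts with _. This is one place where namedtuple departs from the usual Python convention, in which a leading _ marks a private method or attribute. namedtuple puts the underscore on its public methods to avoid name clashes with field names. Besides _asdict, there are also _replace, _fields, and _field_defaults. All of them are described in the namedtuple documentation.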
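A brief sketch of the other underscore-prefixed members follows. It assumes Python 3.7 or later, because the defaults argument used here to populate _field_defaults was added in that version; the default value of 1.0 for alpha is chosen only for illustration.

```python
from collections import namedtuple

# Assumption for illustration: alpha defaults to 1.0 (requires Python 3.7+).
Color = namedtuple("Color", "r g b alpha", defaults=[1.0])
blue = Color(r=0, g=0, b=255)

# _fields lists the field names in order.
fields = Color._fields

# _field_defaults maps field names to their default values.
defaults = Color._field_defaults

# _replace returns a NEW instance; the original stays unchanged.
translucent_blue = blue._replace(alpha=0.5)

print(fields)
print(defaults)
print(blue.alpha, translucent_blue.alpha)
```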

To convert a namedtuple to a regular tuple, just pass it to the tuple constructor.

>>> tuple(Color(r=50, g=205, b=50, alpha=0.1))
(50, 205, 50, 0.1)

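How to sort a list of namedtuples

Another common use case is storing multiple namedtuple instances in a list and sorting them by some criterion. For example, suppose we have a list of colors that we need to sort by alpha intensity.

Fortunately, Python offers a very Pythonic way to do this: operator.attrgetter. According to the documentation, attrgetter "returns a callable object that fetches attr from its operand". Put simply, it lets us pick the field that is passed to sorted as the sort key. For example: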

from operator import attrgetter
...
colors = [
    Color(r=50, g=205, b=50, alpha=0.1),
    Color(r=50, g=205, b=50, alpha=0.5),
    Color(r=50, g=0, b=0, alpha=0.3)
]
...
>>> sorted(colors, key=attrgetter("alpha"))
[Color(r=50, g=205, b=50, alpha=0.1),
 Color(r=50, g=0, b=0, alpha=0.3),
 Color(r=50, g=205, b=50, alpha=0.5)]

The list of colors is now sorted by alpha intensity in ascending order!
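The same approach extends to descending order and to sorting by more than one field, since attrgetter accepts several attribute names and sorted accepts reverse=True. A minimal sketch using the same colors:

```python
from collections import namedtuple
from operator import attrgetter

Color = namedtuple("Color", "r g b alpha")

colors = [
    Color(r=50, g=205, b=50, alpha=0.1),
    Color(r=50, g=205, b=50, alpha=0.5),
    Color(r=50, g=0, b=0, alpha=0.3),
]

# Descending alpha: pass reverse=True to sorted.
by_alpha_desc = sorted(colors, key=attrgetter("alpha"), reverse=True)

# Several fields: attrgetter returns a tuple (g, alpha) used as the sort key.
by_green_then_alpha = sorted(colors, key=attrgetter("g", "alpha"))

print([c.alpha for c in by_alpha_desc])
print([(c.g, c.alpha) for c in by_green_then_alpha])
```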

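How to serialize namedtuples to JSON

Sometimes you may need to store a namedtuple as JSON. Python dictionaries can be converted to JSON with the json module, so we can use the _asdict method to turn the tuple into a dictionary and then proceed exactly as we would with a dictionary. For example:

>>> blue = Color(r=0, g=0, b=255, alpha=1.0)
>>> import json
>>> json.dumps(blue._asdict())
'{"r": 0, "g": 0, "b": 255, "alpha": 1.0}'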
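It may help to see why _asdict is needed and how to get the namedtuple back. In this sketch, passing the namedtuple straight to json.dumps treats it as an ordinary tuple and produces a JSON array without field names, while the _asdict form can be restored with ** unpacking.

```python
import json
from collections import namedtuple

Color = namedtuple("Color", "r g b alpha")
blue = Color(r=0, g=0, b=255, alpha=1.0)

# With _asdict the field names are preserved as JSON object keys.
as_object = json.dumps(blue._asdict())

# Without it, the namedtuple is serialized like a plain tuple (a JSON array).
as_array = json.dumps(blue)

# Round trip: load the JSON object and unpack it into the namedtuple.
restored = Color(**json.loads(as_object))

print(as_object)
print(as_array)
print(restored == blue)
```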

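How to add a docstring to a namedtuple

In Python, we can document methods, classes, and modules with plain strings. The string is then available through a special attribute named __doc__. That said, how do we add a docstring to our Color namedtuple?

We can do this in two ways. The first (and more cumbersome) is to extend the tuple with a wrapper class, which lets us define the docstring in the wrapper. For example, consider the following snippet: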

_Color = namedtuple("Color", "r g b alpha")

class Color(_Color):
    """A namedtuple that represents a color.
    It has 4 fields:
    r - red
    g - green
    b - blue
    alpha - the alpha channel
    """

>>> print(Color.__doc__)
A namedtuple that represents a color.
    It has 4 fields:
    r - red
    g - green
    b - blue
    alpha - the alpha channel
>>> help(Color)
Help on class Color in module __main__:

class Color(Color)
 |  Color(r, g, b, alpha)
 |  
 |  A namedtuple that represents a color.
 |  It has 4 fields:
 |  r - red
 |  g - green
 |  b - blue
 |  alpha - the alpha channel
 |  
 |  Method resolution order:
 |      Color
 |      Color
 |      builtins.tuple
 |      builtins.object
 |  
 |  Data descriptors defined here:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)

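As shown above, by inheriting from the _Color tuple we added a __doc__ attribute to the namedtuple.

The second way is to set the __doc__ attribute directly. This approach does not require extending the tuple.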

>>> Color.__doc__ = """A namedtuple that represents a color.
    It has 4 fields:
    r - red
    g - green
    b - blue
    alpha - the alpha channel
    """

Note that these approaches only work in Python 3+.
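The direct approach can be checked in a short script. This is a sketch under the assumption of a recent Python 3 release; it also sets a docstring on a single field, which the standard library allows from Python 3.5 onward, and the wording of the docstrings is illustrative only.

```python
import inspect
from collections import namedtuple

Color = namedtuple("Color", "r g b alpha")

# Before any change, the generated docstring is just the signature.
default_doc = Color.__doc__

# Set the class docstring directly, without subclassing.
Color.__doc__ = """A namedtuple that represents a color.

It has 4 fields: r, g, b and alpha.
"""

# Individual fields can carry their own docstrings as well (Python 3.5+).
Color.r.__doc__ = "Red channel, 0-255"

print(default_doc)
print(inspect.getdoc(Color))
print(Color.r.__doc__)
```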

For reasons of space, we will stop here and continue in the next part.


Statement:
This article is reproduced from juejin.im. If there is any infringement, please contact admin@php.cn to have it removed.