Detailed explanation of defaultdict in Python (code example)

不言

Oct 25, 2018 pm 05:34 PM

This article explains defaultdict in Python in detail, with code examples. I hope you find it a useful reference.

Default values can be very convenient

In Python, accessing a key that does not exist in a dictionary raises a KeyError exception (in JavaScript, by contrast, accessing a property that does not exist on an object returns undefined). Sometimes, though, it is convenient for every key in a dictionary to have a default value. Consider the following example:

strings = ('puppy', 'kitten', 'puppy', 'puppy',
           'weasel', 'puppy', 'kitten', 'puppy')
counts = {}
for kw in strings:
    counts[kw] += 1

This example is meant to count how many times each word appears in strings and record the result in the counts dictionary: each time a word appears, the value stored under its key in counts is incremented by 1. In practice, however, running this code raises a KeyError the first time any word is counted, because Python's dict has no default values. This can be verified in the Python interactive shell:

>>> counts = dict()
>>> counts
{}
>>> counts['puppy'] += 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'puppy'

Use a conditional statement to check

The first approach that comes to mind is to store the initial value 1 under the corresponding key in counts the first time a word is counted. This requires adding a conditional check inside the loop:

strings = ('puppy', 'kitten', 'puppy', 'puppy',
           'weasel', 'puppy', 'kitten', 'puppy')
counts = {}
for kw in strings:
    if kw not in counts:
        counts[kw] = 1
    else:
        counts[kw] += 1
# counts:
# {'puppy': 5, 'weasel': 1, 'kitten': 2}

Use the dict.setdefault() method

You can also set the default value through the dict.setdefault() method:

strings = ('puppy', 'kitten', 'puppy', 'puppy',
           'weasel', 'puppy', 'kitten', 'puppy')
counts = {}
for kw in strings:
    counts.setdefault(kw, 0)
    counts[kw] += 1

The dict.setdefault() method takes two parameters: the first is the key, and the second is the default value. If the given key does not exist in the dictionary, the default value is stored under that key and returned; otherwise, the value already saved in the dictionary is returned. Using the return value of dict.setdefault(), the code in the for loop can be made more concise:

strings = ('puppy', 'kitten', 'puppy', 'puppy',
           'weasel', 'puppy', 'kitten', 'puppy')
counts = {}
for kw in strings:
    counts[kw] = counts.setdefault(kw, 0) + 1
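
To make the two behaviours of dict.setdefault() concrete, here is a minimal sketch (the variable names first and second are only illustrative). The first call stores and returns the default, while the second call ignores its default because the key already exists:

```python
counts = {}

# Key is missing: setdefault() stores the default under the key and returns it.
first = counts.setdefault('puppy', 0)

counts['puppy'] += 1

# Key is present: the stored value is returned and the new default is ignored.
second = counts.setdefault('puppy', 100)

print(first, second, counts)
```

Because the key is inserted as a side effect, a later counts[kw] += 1 can no longer fail for that key.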

Use the collections.defaultdict class

Although the methods above solve, to a certain extent, the problem that dict has no default values, we might wonder whether there is a dictionary that provides default values by itself. The answer is yes: collections.defaultdict.

The defaultdict class is like a dict, but it is initialized with a type:

>>> from collections import defaultdict
>>> dd = defaultdict(list)
>>> dd
defaultdict(<class 'list'>, {})

The initialization function of the defaultdict class accepts a type as a parameter. When the key being accessed does not exist, the type is instantiated and the resulting value is used as the default:

>>> dd['foo']
[]
>>> dd
defaultdict(<class 'list'>, {'foo': []})
>>> dd['bar'].append('quux')
>>> dd
defaultdict(<class 'list'>, {'foo': [], 'bar': ['quux']})

Note that this kind of default value only takes effect when the key is accessed through dict[key] or dict.__getitem__(key). The reason is explained below.

>>> from collections import defaultdict
>>> dd = defaultdict(list)
>>> 'something' in dd
False
>>> dd.pop('something')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'something'
>>> dd.get('something')
>>> dd['something']
[]
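
As a sketch of how a list factory is typically used together with the access rules above, the following groups words by their first letter. The data and the names words, by_letter, missing and has_z are made up for illustration:

```python
from collections import defaultdict

words = ('puppy', 'kitten', 'pony', 'weasel', 'koala')

# Group words by first letter; a letter seen for the first time
# starts out as an empty list created by the list factory.
by_letter = defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)

# get() and the "in" operator do not go through __getitem__(),
# so they neither call the factory nor create a key for 'z'.
missing = by_letter.get('z')
has_z = 'z' in by_letter

print(dict(by_letter), missing, has_z)
```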

Besides a type, the initialization function also accepts any callable that takes no arguments. The return value of the callable is then used as the default value, which makes default values more flexible. The following example shows how to pass a custom zero-argument function zero() to the initialization function:

>>> from collections import defaultdict
>>> def zero():
...     return 0
...
>>> dd = defaultdict(zero)
>>> dd
defaultdict(<function zero at 0xb7ed2684>, {})
>>> dd['foo']
0
>>> dd
defaultdict(<function zero at 0xb7ed2684>, {'foo': 0})

Using collections.defaultdict to solve the original word-counting problem, the code is as follows:

from collections import defaultdict
strings = ('puppy', 'kitten', 'puppy', 'puppy',
           'weasel', 'puppy', 'kitten', 'puppy')
counts = defaultdict(lambda: 0)  # use a lambda to define a simple function
for s in strings:
    counts[s] += 1
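
Since int() called without arguments returns 0, the built-in int type can stand in for the lambda. A minimal sketch using the same data:

```python
from collections import defaultdict

strings = ('puppy', 'kitten', 'puppy', 'puppy',
           'weasel', 'puppy', 'kitten', 'puppy')

# int is a zero-argument callable that returns 0, so it works as the factory.
counts = defaultdict(int)
for s in strings:
    counts[s] += 1

print(dict(counts))
```

Whether to pass int or a lambda is a matter of taste; a lambda is only needed when the starting value is something other than what a type's no-argument constructor returns.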

How the defaultdict class is implemented

Having seen how the defaultdict class is used, how does it implement default values? The key is the __missing__() method:

>>> from collections import defaultdict
>>> print(defaultdict.__missing__.__doc__)
__missing__(key) # Called by __getitem__ for missing key; pseudo-code:
  if self.default_factory is None: raise KeyError(key)
  self[key] = value = self.default_factory()
  return value

From the docstring of the __missing__() method we can see that when a non-existent key is accessed through the __getitem__() method (dict[key] is simply shorthand for calling __getitem__()), __getitem__() calls __missing__() to obtain the default value and add the key to the dictionary.

For a detailed introduction to the __missing__() method, please refer to the "Mapping Types — dict" section in the official Python documentation.

According to the documentation, starting from version 2.5, if a subclass derived from dict defines the __missing__() method, then dict[key] calls __missing__() to obtain the default value when a non-existent key is accessed.

It can be seen from this that although dict supports the __missing__() method, dict itself does not define it; the method has to be implemented in a derived subclass. This is easy to verify:

>>> print(dict.__missing__.__doc__)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: type object 'dict' has no attribute '__missing__'

At the same time, we can do further experiments, define a subclass Missing and implement the __missing__() method:

>>> class Missing(dict):
...     def __missing__(self, key):
...         return 'missing'
...
>>> d = Missing()
>>> d
{}
>>> d['foo']
'missing'
>>> d
{}

The result shows that the __missing__() method does work. Building on this, we modify the __missing__() method slightly so that the subclass, like the defaultdict class, stores a default value for non-existent keys:

>>> class Defaulting(dict):
...     def __missing__(self, key):
...         self[key] = 'default'
...         return 'default'
...
>>> d = Defaulting()
>>> d
{}
>>> d['foo']
'default'
>>> d
{'foo': 'default'}
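
One thing a subclass with __missing__() can do that a zero-argument factory cannot is compute the default from the key itself. The following is only a sketch; the class name KeyLength and its rule (the default is the length of the key) are hypothetical, chosen purely for illustration:

```python
class KeyLength(dict):
    """Hypothetical dict subclass: the default for a missing key is len(key)."""

    def __missing__(self, key):
        # __missing__() receives the key, so the default can depend on it.
        value = len(key)
        self[key] = value
        return value


d = KeyLength()
foo_value = d['foo']        # triggers __missing__('foo') and stores the result
bar_present = 'bar' in d    # membership tests do not trigger __missing__()

print(foo_value, bar_present, d)
```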

Implementing defaultdict functionality in older versions of Python

The defaultdict class was added in version 2.5 and is not available in older versions, so for those versions a compatible defaultdict class has to be implemented. This is actually quite simple: although its performance may not match the defaultdict class that ships with version 2.5, it provides the same basic functionality.

First, the __getitem__() method needs to call the __missing__() method when key access fails:

class defaultdict(dict):
    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            return self.__missing__(key)

Secondly, the __missing__() method needs to be implemented to set the default value:

class defaultdict(dict):
    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            return self.__missing__(key)
    def __missing__(self, key):
        self[key] = value = self.default_factory()
        return value

Then, the initialization function __init__() of the defaultdict class needs to accept a type or a callable as a parameter:

class defaultdict(dict):
    def __init__(self, default_factory=None, *a, **kw):
        dict.__init__(self, *a, **kw)
        self.default_factory = default_factory
    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            return self.__missing__(key)
    def __missing__(self, key):
        self[key] = value = self.default_factory()
        return value

Finally, putting all of the above together, the following code is compatible with both old and new versions of Python:

try:
    from collections import defaultdict
except ImportError:
    class defaultdict(dict):
        def __init__(self, default_factory=None, *a, **kw):
            dict.__init__(self, *a, **kw)
            self.default_factory = default_factory

        def __getitem__(self, key):
            try:
                return dict.__getitem__(self, key)
            except KeyError:
                return self.__missing__(key)

        def __missing__(self, key):
            self[key] = value = self.default_factory()
            return value
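
As a rough check of the compatible class, the sketch below renames it to CompatDefaultDict (a hypothetical name, so that it does not shadow the standard class) and adds one assumption that is not in the code above: like the pseudo-code in the __missing__() docstring, it raises KeyError when no default_factory was given.

```python
class CompatDefaultDict(dict):
    """Illustrative stand-in for the fallback defaultdict class."""

    def __init__(self, default_factory=None, *a, **kw):
        dict.__init__(self, *a, **kw)
        self.default_factory = default_factory

    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            return self.__missing__(key)

    def __missing__(self, key):
        # Assumption added here: without a factory, behave like a plain dict.
        if self.default_factory is None:
            raise KeyError(key)
        self[key] = value = self.default_factory()
        return value


dd = CompatDefaultDict(list)
dd['a'].append(1)           # missing key: the list factory creates []

plain = CompatDefaultDict()  # no factory given
try:
    plain['x']
    raised = False
except KeyError:
    raised = True

print(dd, raised)
```

Without the None check, accessing a missing key on an instance created with no factory would fail with a TypeError instead of a KeyError.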

Statement

This article is reproduced from CSDN. If there is any infringement, please contact admin@php.cn to have it removed.
