search
HomeBackend DevelopmentPython TutorialAn in-depth analysis of yield and generators in Python

An in-depth analysis of yield and generators in Python

Aug 16, 2017 pm 01:13 PM
generatorspythonyield

The generator and yield keyword may be one of the most powerful and difficult to understand concepts in Python (perhaps none), but it does not prevent yield from becoming the most powerful keyword in Python, for beginners It is indeed very difficult to understand. Let's read an article written by a foreign expert about yield to help you quickly understand yield. The article is a bit long, so please be patient and read it to the end. There are some examples along the way, so you won’t feel bored.

Generator

A generator is a function composed of one or more yield expressions. Each generator is an iterator (but an iterator is not necessarily a generator).

If a function contains the yield keyword, the function will become a generator.

The generator does not return all results at once, but returns the corresponding results each time it encounters the yield keyword, and retains the current running status of the function, waiting for the next call.

Since the generator is also an iterator, it should support the next method to get the next value. (You can also use the .__next__() attribute, which is .next() in python2)

Coroutines and subroutines

We call an ordinary Python function When executing, execution generally starts from the first line of code of the function and ends with the return statement, exception or function end (can be regarded as an implicit return of None). Once the function returns control to the caller, it's all over. All work done in the function and data saved in local variables will be lost. When this function is called again, everything will be created from scratch.

This is a pretty standard process for functions discussed in computer programming. Such a function can only return a single value, but sometimes it is helpful to create a function that produces a sequence. To do this, such a function needs to be able to "save its own work".

I said that the reason for being able to "produce a sequence" is that our function does not return in the usual sense. The implicit meaning of return is that the function is returning control of the executed code to the place where the function was called. The implicit meaning of "yield" is that the transfer of control is temporary and voluntary, and our function will take back control in the future.

In Python, a "function" with this ability is called a generator, and it is very useful. Generators (and thus the yield statement) were originally introduced to make it easier for programmers to write code that produces sequences of values. Previously, to implement something like a random number generator, you would implement a class or module that generated data while keeping track of the state between each call. With the introduction of generators, this becomes very easy.

In order to better understand the problem solved by the generator, let us look at an example. As we work through this example, always keep in mind the problem we need to solve: generating a sequence of values.

Note: Outside of Python, the simplest generators are what are called coroutines. In this article, I will use this term. Remember, in Python’s concept, the coroutines mentioned here are generators. The formal term for Python is generator; coroutine is just for discussion and has no formal definition at the language level.

Example: Interesting Prime Numbers

Suppose your boss asks you to write a function. The input parameter is a list of ints and returns an iterable result containing the prime number 1.

Remember, iterator (Iterable) is just the ability of an object to return specific members each time.

You must think "this is very simple", and then quickly write the following code:

def get_primes(input_list):
    result_list = list()
    for element in input_list:
        if is_prime(element):
            result_list.append()
    return result_list
# 或者更好一些的...
def get_primes(input_list):
    return (element for element in input_list if is_prime(element))
# 下面是 is_prime 的一种实现...
def is_prime(number):
    if number > 1:
        if number == 2:
            return True
        if number % 2 == 0:
            return False
        for current in range(3, int(math.sqrt(number) + 1), 2):
            if number % current == 0: 
                return False
        return True
    return False

The above implementation of is_prime fully meets the needs, so we tell the boss that it is done. She reported back that our function worked fine and was exactly what she wanted.

Handling infinite sequences

Oh, is that really true? A few days later, the boss came over and told us that she had encountered some small problems: she planned to use our get_primes function for a large list containing numbers. In fact, this list is so large that just creating it would use up all the system's memory. To this end, she hopes to bring a start parameter when calling the get_primes function and return all prime numbers greater than this parameter (perhaps she wants to solve Project Euler problem 10).

Let's take a look at this new requirement. Obviously it is impossible to simply modify get_primes. Naturally, it is impossible to return a list containing all the prime numbers from start to infinity (although there are many useful applications for manipulating infinite sequences). It seems that the possibility of using ordinary functions to deal with this problem is relatively slim.

Before we give up, let’s identify the core obstacle, what prevents us from writing functions that meet the boss’s new needs. After thinking, we came to the conclusion that the function has only one chance to return the result, so it must return all the results at once. It seems pointless to come to such a conclusion; "Isn't that how functions work?" That's what we usually think. However, you can’t succeed if you don’t learn, and you don’t know if you don’t ask. “What if they are not so?”

想象一下,如果get_primes可以只是简单返回下一个值,而不是一次返回全部的值,我们能做什么?我们就不再需要创建列表。没有列表,就没有内存的问题。由于老板告诉我们的是,她只需要遍历结果,她不会知道我们实现上的区别。

不幸的是,这样做看上去似乎不太可能。即使是我们有神奇的函数,可以让我们从n遍历到无限大,我们也会在返回第一个值之后卡住:

def get_primes(start):
    for element in magical_infinite_range(start):
        if is_prime(element):
            return element

假设这样去调用get_primes:

def solve_number_10():
    # She *is* working on Project Euler #10, I knew it!
    total = 2
    for next_prime in get_primes(3):
        if next_prime < 2000000:
            total += next_prime
        else:
            print(total)
            return

显然,在get_primes中,一上来就会碰到输入等于3的,并且在函数的第4行返回。与直接返回不同,我们需要的是在退出时可以为下一次请求准备一个值。

不过函数做不到这一点。当函数返回时,意味着全部完成。我们保证函数可以再次被调用,但是我们没法保证说,“呃,这次从上次退出时的第4行开始执行,而不是常规的从第一行开始”。函数只有一个单一的入口:函数的第1行代码。

走进生成器

这类问题极其常见以至于Python专门加入了一个结构来解决它:生成器。一个生成器会“生成”值。创建一个生成器几乎和生成器函数的原理一样简单。

一个生成器函数的定义很像一个普通的函数,除了当它要生成一个值的时候,使用yield关键字而不是return。如果一个def的主体包含yield,这个函数会自动变成一个生成器(即使它包含一个return)。除了以上内容,创建一个生成器没有什么多余步骤了。

生成器函数返回生成器的迭代器。这可能是你最后一次见到“生成器的迭代器”这个术语了, 因为它们通常就被称作“生成器”。要注意的是生成器就是一类特殊的迭代器。作为一个迭代器,生成器必须要定义一些方法(method),其中一个就是__next__()【注意: 在python2中是: next() 方法】。如同迭代器一样,我们可以使用next()函数来获取下一个值。

为了从生成器获取下一个值,我们使用next()函数,就像对付迭代器一样。

(next()会操心如何调用生成器的__next__()方法)。既然生成器是一个迭代器,它可以被用在for循环中。

每当生成器被调用的时候,它会返回一个值给调用者。在生成器内部使用yield来完成这个动作(例如yield 7)。为了记住yield到底干了什么,最简单的方法是把它当作专门给生成器函数用的特殊的return(加上点小魔法)。**

yield就是专门给生成器用的return(加上点小魔法)。

下面是一个简单的生成器函数:

>>> def simple_generator_function():
>>>    yield 1
>>>    yield 2
>>>    yield 3

这里有两个简单的方法来使用它:

>>> for value in simple_generator_function():
>>>     print(value)
1
2
3
>>> our_generator = simple_generator_function()
>>> next(our_generator)
1
>>> next(our_generator)
2
>>> next(our_generator)
3

魔法?

那么神奇的部分在哪里?我很高兴你问了这个问题!当一个生成器函数调用yield,生成器函数的“状态”会被冻结,所有的变量的值会被保留下来,下一行要执行的代码的位置也会被记录,直到再次调用next()。一旦next()再次被调用,生成器函数会从它上次离开的地方开始。如果永远不调用next(),yield保存的状态就被无视了。

我们来重写get_primes()函数,这次我们把它写作一个生成器。注意我们不再需要magical_infinite_range函数了。使用一个简单的while循环,我们创造了自己的无穷串列。

def get_primes(number):
    while True:
        if is_prime(number):
            yield number
        number += 1

如果生成器函数调用了return,或者执行到函数的末尾,会出现一个StopIteration异常。 这会通知next()的调用者这个生成器没有下一个值了(这就是普通迭代器的行为)。这也是这个while循环在我们的get_primes()函数出现的原因。如果没有这个while,当我们第二次调用next()的时候,生成器函数会执行到函数末尾,触发StopIteration异常。一旦生成器的值用完了,再调用next()就会出现错误,所以你只能将每个生成器的使用一次。下面的代码是错误的:

>>> our_generator = simple_generator_function()
>>> for value in our_generator:
>>>     print(value)
>>> # 我们的生成器没有下一个值了...
>>> print(next(our_generator))
Traceback (most recent call last):
  File "<ipython-input-13-7e48a609051a>", line 1, in <module>
    next(our_generator)
StopIteration
>>> # 然而,我们总可以再创建一个生成器
>>> # 只需再次调用生成器函数即可
>>> new_generator = simple_generator_function()
>>> print(next(new_generator)) # 工作正常
1

因此,这个while循环是用来确保生成器函数永远也不会执行到函数末尾的。只要调用next()这个生成器就会生成一个值。这是一个处理无穷序列的常见方法(这类生成器也是很常见的)。

执行流程

让我们回到调用get_primes的地方:solve_number_10。

def solve_number_10():
    # She *is* working on Project Euler #10, I knew it!
    total = 2
    for next_prime in get_primes(3):
        if next_prime < 2000000:
            total += next_prime
        else:
            print(total)
            return

我们来看一下solve_number_10的for循环中对get_primes的调用,观察一下前几个元素是如何创建的有助于我们的理解。当for循环从get_primes请求第一个值时,我们进入get_primes,这时与进入普通函数没有区别。

进入第三行的while循环

停在if条件判断(3是素数)

通过yield将3和执行控制权返回给solve_number_10

接下来,回到insolve_number_10:

for循环得到返回值3

for循环将其赋给next_prime

total加上next_prime

for循环从get_primes请求下一个值

这次,进入get_primes时并没有从开头执行,我们从第5行继续执行,也就是上次离开的地方。

def get_primes(number):
    while True:
        if is_prime(number):
            yield number
        number += 1 # <<<<<<<<<<

最关键的是,number还保持我们上次调用yield时的值(例如3)。记住,yield会将值传给next()的调用方,同时还会保存生成器函数的“状态”。接下来,number加到4,回到while循环的开始处,然后继续增加直到得到下一个素数(5)。我们再一次把number的值通过yield返回给solve_number_10的for循环。这个周期会一直执行,直到for循环结束(得到的素数大于2,000,000)。

总结

关键点:

generator是用来产生一系列值的

yield则像是generator函数的返回结果

yield唯一所做的另一件事就是保存一个generator函数的状态

generator就是一个特殊类型的迭代器(iterator)

和迭代器相似,我们可以通过使用next()来从generator中获取下一个值

通过隐式地调用next()来忽略一些值

The above is the detailed content of An in-depth analysis of yield and generators in Python. For more information, please follow other related articles on the PHP Chinese website!

Statement
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn
Python vs. C  : Understanding the Key DifferencesPython vs. C : Understanding the Key DifferencesApr 21, 2025 am 12:18 AM

Python and C each have their own advantages, and the choice should be based on project requirements. 1) Python is suitable for rapid development and data processing due to its concise syntax and dynamic typing. 2)C is suitable for high performance and system programming due to its static typing and manual memory management.

Python vs. C  : Which Language to Choose for Your Project?Python vs. C : Which Language to Choose for Your Project?Apr 21, 2025 am 12:17 AM

Choosing Python or C depends on project requirements: 1) If you need rapid development, data processing and prototype design, choose Python; 2) If you need high performance, low latency and close hardware control, choose C.

Reaching Your Python Goals: The Power of 2 Hours DailyReaching Your Python Goals: The Power of 2 Hours DailyApr 20, 2025 am 12:21 AM

By investing 2 hours of Python learning every day, you can effectively improve your programming skills. 1. Learn new knowledge: read documents or watch tutorials. 2. Practice: Write code and complete exercises. 3. Review: Consolidate the content you have learned. 4. Project practice: Apply what you have learned in actual projects. Such a structured learning plan can help you systematically master Python and achieve career goals.

Maximizing 2 Hours: Effective Python Learning StrategiesMaximizing 2 Hours: Effective Python Learning StrategiesApr 20, 2025 am 12:20 AM

Methods to learn Python efficiently within two hours include: 1. Review the basic knowledge and ensure that you are familiar with Python installation and basic syntax; 2. Understand the core concepts of Python, such as variables, lists, functions, etc.; 3. Master basic and advanced usage by using examples; 4. Learn common errors and debugging techniques; 5. Apply performance optimization and best practices, such as using list comprehensions and following the PEP8 style guide.

Choosing Between Python and C  : The Right Language for YouChoosing Between Python and C : The Right Language for YouApr 20, 2025 am 12:20 AM

Python is suitable for beginners and data science, and C is suitable for system programming and game development. 1. Python is simple and easy to use, suitable for data science and web development. 2.C provides high performance and control, suitable for game development and system programming. The choice should be based on project needs and personal interests.

Python vs. C  : A Comparative Analysis of Programming LanguagesPython vs. C : A Comparative Analysis of Programming LanguagesApr 20, 2025 am 12:14 AM

Python is more suitable for data science and rapid development, while C is more suitable for high performance and system programming. 1. Python syntax is concise and easy to learn, suitable for data processing and scientific computing. 2.C has complex syntax but excellent performance and is often used in game development and system programming.

2 Hours a Day: The Potential of Python Learning2 Hours a Day: The Potential of Python LearningApr 20, 2025 am 12:14 AM

It is feasible to invest two hours a day to learn Python. 1. Learn new knowledge: Learn new concepts in one hour, such as lists and dictionaries. 2. Practice and exercises: Use one hour to perform programming exercises, such as writing small programs. Through reasonable planning and perseverance, you can master the core concepts of Python in a short time.

Python vs. C  : Learning Curves and Ease of UsePython vs. C : Learning Curves and Ease of UseApr 19, 2025 am 12:20 AM

Python is easier to learn and use, while C is more powerful but complex. 1. Python syntax is concise and suitable for beginners. Dynamic typing and automatic memory management make it easy to use, but may cause runtime errors. 2.C provides low-level control and advanced features, suitable for high-performance applications, but has a high learning threshold and requires manual memory and type safety management.

See all articles

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

Video Face Swap

Video Face Swap

Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Tools

VSCode Windows 64-bit Download

VSCode Windows 64-bit Download

A free and powerful IDE editor launched by Microsoft

DVWA

DVWA

Damn Vulnerable Web App (DVWA) is a PHP/MySQL web application that is very vulnerable. Its main goals are to be an aid for security professionals to test their skills and tools in a legal environment, to help web developers better understand the process of securing web applications, and to help teachers/students teach/learn in a classroom environment Web application security. The goal of DVWA is to practice some of the most common web vulnerabilities through a simple and straightforward interface, with varying degrees of difficulty. Please note that this software

PhpStorm Mac version

PhpStorm Mac version

The latest (2018.2.1) professional PHP integrated development tool

SublimeText3 English version

SublimeText3 English version

Recommended: Win version, supports code prompts!

Atom editor mac version download

Atom editor mac version download

The most popular open source editor