Home >Backend Development >Python Tutorial >Error updating list while looping in Python

Error updating list while looping in Python

WBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWB
WBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBforward
2024-02-22 13:07:03884browse

在 Python 中循环时更新列表时出错

Question content

Why is the list "spans" never updated? I don't understand why the code gets stuck in an infinite loop.

pdf: https://www.sil.org/system/files/reapdata/62/99/18/62991811720566250411942290005522370655/40337_02.pdf

"Block" example: https://jumpshare.com/s/y393jobqjfiye51gkexn

import fitz

doc = fitz.open("cubeo/40337_02.pdf")
page = doc[3]

blocks = page.get_text("dict", flags = fitz.TEXTFLAGS_TEXT)["blocks"]
for block in blocks: 
    entries = []
    if len(block["lines"]) > 3: # ignora legendas e número de página
        for line in block["lines"]: 
            spans = []
            for span in line["spans"]:
                spans.append({"text": span["text"].replace("�", " "), "size": int(span["size"]), "font": span["font"]})

            # While there are spans left
            while True:
                # Delimits where an entry starts
                entry_first_position = None
                for i, span in enumerate(spans):
                    if span["font"] == "Sb&cuSILCharis-Bold":
                        entry_first_position = i
                        break
                if entry_first_position is not None:
                    # Delimits where an entry ends
                    entry_last_position = None
                    for i, span in enumerate(spans[entry_first_position:], start=entry_first_position):
                        if span["font"] == "Sb&cuSILCharis-Bold":
                            entry_last_position = i
                            break
                    if entry_last_position is not None:
                        # Whole entry is added as a list
                        append_list = spans[entry_first_position:entry_last_position]
                        entries.append(append_list)
                        spans = spans[:entry_first_position] + spans[entry_last_position:]
                    else:
                        break
                else:
                    break
             print(spans)

What I expect is that print(spans) outputs "[]". However, the code never reaches this point.


Correct answer


for i, span in enumerate(spans[entry_first_position:], start=entry_first_position):

Will not skip the first occurrence of span["font"] == "sb&cusilcharis-bold". So entry_last_position == entry_first_position , nothing is deleted and you're stuck in an infinite loop. Change it to

for i, span in enumerate(spans[entry_first_position+1:], start=entry_first_position+1):

So it starts looking at the next position in the list

The above is the detailed content of Error updating list while looping in Python. For more information, please follow other related articles on the PHP Chinese website!

Statement:
This article is reproduced at:stackoverflow.com. If there is any infringement, please contact admin@php.cn delete