
How to Configure Pytesseract for Single-Digit Number Recognition Only?

Pytesseract OCR: Configuring Single-Digit, Digits-Only Recognition

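Pytesseract is an open-source Python wrapper for the Tesseract OCR engine, and it lets you pass engine-specific options through to Tesseract. Here the goal is to configure Tesseract to recognize a single character and to restrict its output to digits, because the digit "0" is often misread as the letter "O".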

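Problem Definition

A user had trouble configuring Pytesseract for this purpose with the following call. It is not valid Python, because the config keyword argument is passed twice: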

target = pytesseract.image_to_string(im,config='-psm 7',config='outputbase digits')
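
Since a keyword argument can only appear once in a call, all Tesseract options have to travel in a single config string. The following is a minimal sketch of that idea, not pytesseract's own API: combine_options is a hypothetical helper, the digit whitelist is used as the second option purely for illustration, and shlex.split is shown only to illustrate how such a string separates into command-line style tokens.

```python
import shlex


def combine_options(*fragments: str) -> str:
    """Join option fragments into one string for a single config= argument.

    Hypothetical helper for illustration; it is not part of pytesseract.
    """
    return " ".join(f.strip() for f in fragments if f.strip())


# Two separate options merged into one string (modern double-dash --psm spelling).
config = combine_options("--psm 7", "-c tessedit_char_whitelist=0123456789")
print(config)

# How the combined string breaks into individual command-line style tokens.
print(shlex.split(config))
```

The resulting string can then be passed once, as config=config, instead of repeating the keyword.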

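Setting the Parameters

As documented for tesseract-4.0.0a, Tesseract supports several page segmentation modes (PSM), each making a different assumption about how the text is laid out in the image. For single-character recognition, set psm to 10, which treats the image as a single character. To restrict recognition to digits, set tessedit_char_whitelist so that it contains only the digits 0-9.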

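# --psm 10: treat the image as a single character; the whitelist limits output to 0-9.
# Note: older pytesseract versions also took boxes=False here; recent releases do not accept it.
target = pytesseract.image_to_string(image, lang='eng',
        config='--psm 10 --oem 3 -c tessedit_char_whitelist=0123456789')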
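
To put these settings in context, here is a minimal end-to-end sketch. It assumes Pillow and pytesseract are installed and that the Tesseract binary is on the PATH. The helper names build_digit_config, clean_digit, and read_single_digit, as well as the file name digit.png, are hypothetical and not part of pytesseract. The extra validation step is an addition for illustration: it rejects any result that is not exactly one digit.

```python
from typing import Optional

DIGITS = "0123456789"


def build_digit_config(psm: int = 10, oem: int = 3) -> str:
    """Build the config string: single-character segmentation, digits only."""
    return f"--psm {psm} --oem {oem} -c tessedit_char_whitelist={DIGITS}"


def clean_digit(raw: str) -> Optional[str]:
    """Return the recognized digit, or None if the output is not exactly one digit."""
    text = raw.strip()  # OCR output usually carries trailing whitespace
    return text if len(text) == 1 and text in DIGITS else None


def read_single_digit(image_path: str) -> Optional[str]:
    """Run OCR on an image that is expected to contain exactly one digit."""
    # Imported lazily so the two helpers above work without the OCR stack installed.
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as image:
        raw = pytesseract.image_to_string(
            image, lang="eng", config=build_digit_config()
        )
    return clean_digit(raw)


if __name__ == "__main__":
    print(build_digit_config())
    # print(read_single_digit("digit.png"))  # hypothetical input file
```

If the whitelist appears to be ignored, it may be worth checking the installed Tesseract version, since whitelist support under the LSTM engine has reportedly varied between 4.x releases.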
