
A Brief Introduction to Automatically Downloading Email with Python

WBOY (reposted)
2022-08-17 18:01:00, 2598 views

This article shows how to use Python to download email automatically and parse attachments, with example code explained step by step. We hope you find it helpful.



Before writing any code, let's look at three mail service protocols:

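1. SMTP

SMTP (Simple Mail Transfer Protocol) is the protocol used to send mail. It acts as a relay: the client submits an outgoing message to a mail server, and servers pass it along until it reaches the recipient's mail server. It is not used to retrieve mail.

2. POP3

POP3 (Post Office Protocol version 3) is the third version of the Post Office Protocol and was the first offline protocol standard for email. It downloads messages to the local machine and does not keep the client in sync with the server. The drawback is that mail is easier to lose, or to download more than once.

3. IMAP

IMAP (Internet Message Access Protocol) is an interactive mail access protocol. The client connects to the remote mailbox and operates on it directly, so what the client sees stays in sync with the server.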
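
Python's standard library ships one client module per protocol: smtplib, poplib and imaplib. As a small sketch, the lookup table below records what each protocol is for and reads the default SSL port from the module constants. The names MAIL_PROTOCOLS and describe are illustrative and do not come from the original article.

```python
import imaplib
import poplib
import smtplib

# Illustrative lookup table: protocol -> (stdlib module, purpose, default SSL port)
MAIL_PROTOCOLS = {
    "SMTP": (smtplib, "send mail", smtplib.SMTP_SSL_PORT),
    "POP3": (poplib, "download mail to the local machine", poplib.POP3_SSL_PORT),
    "IMAP": (imaplib, "work on the remote mailbox in place", imaplib.IMAP4_SSL_PORT),
}


def describe(protocol):
    """Return a one-line summary for a protocol in MAIL_PROTOCOLS."""
    module, purpose, port = MAIL_PROTOCOLS[protocol]
    return f"{protocol}: {purpose} (module {module.__name__}, SSL port {port})"


for name in MAIL_PROTOCOLS:
    print(describe(name))
```

Since the rest of this article reads mail in place on the server, imaplib is the module it uses.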

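Next, the email package

The central component of this package is an "object model" that represents email messages. An application interacts with the package mainly through the object model interface defined in the message submodule. The application can use this API to ask questions about an existing email, to construct a new email, or to add or remove email subcomponents that themselves use the same object model interface. In other words, following the nature of email messages and their MIME subcomponents, the email object model is a tree of objects that all provide the EmailMessage API.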
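
To make the tree structure concrete, here is a minimal sketch that builds a message with one attachment, round-trips it through bytes, and walks the resulting tree. The addresses, subject and file name are made-up sample values, and policy.default is used so that parsing yields EmailMessage objects.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage

# Build a message with a text body and one attachment (all values are made up).
msg = EmailMessage()
msg["Subject"] = "Monthly report"
msg["From"] = "sender@example.com"
msg["To"] = "receiver@example.com"
msg.set_content("See the attached file.")
msg.add_attachment(
    b"a,b\n1,2\n",
    maintype="application",
    subtype="octet-stream",
    filename="report.csv",
)

# Round-trip through bytes, as if the message had arrived from a server.
parsed = message_from_bytes(msg.as_bytes(), policy=policy.default)

# The message is a tree: a multipart root whose leaves are the body and the attachment.
structure = [part.get_content_type() for part in parsed.walk()]
print(structure)

# The same API answers questions about the body and the attachments.
body = parsed.get_body(preferencelist=("plain",)).get_content()
attachments = [(a.get_filename(), a.get_content()) for a in parsed.iter_attachments()]
print(body.strip(), attachments)
```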

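Next, we write code that logs in to a mailbox, downloads mail, and parses the message body and attachments.

First we define a mail-parsing class, which needs three values:

1. the address of the mailbox's IMAP server;

2. the mail account;

3. the mailbox password. [Note: different providers have different security policies. QQ Mail, for example, requires SMS verification to obtain a login authorization code, and that code is used instead of the plain-text password when logging in from a remote client.]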
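
Because the third value is often an authorization code, it is worth keeping it out of the source file. The sketch below is one possible way to load the three values from environment variables. The variable names MAIL_IMAP_HOST, MAIL_ACCOUNT and MAIL_AUTH_CODE, as well as MailSettings and load_settings, are hypothetical and not part of the original article.

```python
import os
from dataclasses import dataclass

# Hypothetical environment variable names, chosen for this sketch only.
REQUIRED_KEYS = ("MAIL_IMAP_HOST", "MAIL_ACCOUNT", "MAIL_AUTH_CODE")


@dataclass(frozen=True)
class MailSettings:
    remote_server_url: str  # IMAP server address
    email_url: str          # mail account
    password: str           # password, or authorization code for providers such as QQ Mail


def load_settings(env=os.environ):
    """Read the three values from a mapping, failing loudly if one is missing."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise KeyError("missing settings: " + ", ".join(missing))
    return MailSettings(*(env[key] for key in REQUIRED_KEYS))
```

The resulting fields have the same names as the constructor arguments of the class defined next, so they can be passed straight through.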

class Email_parse:

    def __init__(self, remote_server_url, email_url, password):
        # IMAP server address
        self.remote_server_url = remote_server_url
        # Mail account
        self.email_url = email_url
        # Mailbox password (or authorization code)
        self.password = password

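Then we define the class's entry point. It logs in to the remote server, searches the inbox for all messages, fetches the newest one, and prints its subject. [Subjects can use different encodings, so the raw bytes must be decoded before they display correctly.]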

    def main_parse_Email(self):
        """Entry point: log in to the IMAP server and read the newest message."""
        server = imaplib.IMAP4_SSL(self.remote_server_url, 993)
        server.login(self.email_url, self.password)
        server.select('INBOX')
        status, data = server.search(None, "ALL")
        if status != 'OK':
            raise Exception('read email error')
        emailids = data[0].split()
        mail_counts = len(emailids)
        print("count:", mail_counts)
        # IDs come back oldest first, so we walk backwards from the end.
        # This range selects only the newest message and is empty for an empty inbox.
        for i in range(mail_counts - 1, max(mail_counts - 2, -1), -1):
            status, edata = server.fetch(emailids[i], '(RFC822)')
            msg = email.message_from_bytes(edata[0][1])
            # Decode the subject; fall back to UTF-8 when no charset is declared
            subject = email.header.decode_header(msg.get('subject', ''))
            if isinstance(subject[-1][0], bytes):
                title = subject[-1][0].decode(subject[-1][1] or 'utf-8', errors='replace')
            else:
                title = subject[-1][0]
            print("title:", title)
        server.logout()
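
The index arithmetic in the loop above can be isolated into a small pure function, which is easier to test without a server. This is only a sketch: newest_ids is a hypothetical helper, and it assumes, as the code above does, that the search response lists message IDs oldest first in a single space-separated bytes string.

```python
def newest_ids(search_data, limit=1):
    """Return up to `limit` message IDs from an IMAP SEARCH response, newest first."""
    ids = search_data[0].split() if search_data and search_data[0] else []
    # Reverse so the most recent ID comes first, then keep the first `limit` entries.
    return ids[::-1][:limit]


print(newest_ids([b"1 2 3 4 5"], limit=2))
```

Raising limit is then enough to process the latest few messages instead of just one.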

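Here the msg variable holds the parsed message. Because msg and title are used repeatedly from here on, we add a helper to the class that takes msg and returns its title.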

    @staticmethod
    def get_email_title(msg):
        """Decode and return the subject of a message."""
        subject = email.header.decode_header(msg.get('subject', ''))
        if isinstance(subject[-1][0], bytes):
            # Fall back to UTF-8 when the header declares no charset
            title = subject[-1][0].decode(subject[-1][1] or 'utf-8', errors='replace')
        else:
            title = subject[-1][0]
        print("title:", title)
        return title
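
The helper above decodes only the last fragment of the subject. A subject can consist of several fragments with different encodings, so as an alternative sketch (not the author's code), the standard library's make_header can join all fragments returned by decode_header. The name decode_subject is illustrative.

```python
from email.header import decode_header, make_header


def decode_subject(raw_subject):
    """Decode an RFC 2047 encoded subject, joining every fragment."""
    if not raw_subject:
        return ""
    return str(make_header(decode_header(raw_subject)))


# A Base64-encoded UTF-8 subject followed by plain ASCII text.
print(decode_subject("=?utf-8?b?5rWL6K+V?= report"))
```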

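We parse a message in two parts: the body (HTML) and the attachments (xlsx and so on). When a part is an attachment, we save it under a fixed path. Parsing the spreadsheet itself is not covered here; a package such as pandas is enough for that.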

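    @staticmethod
    def get_att(msg):
        """Find attachments and save them under ./src/ (the directory must already exist)."""
        for part in msg.walk():
            # get_filename() returns None for parts that carry no file name
            file_name = part.get_filename()
            if not file_name:
                continue
            # Decode RFC 2047 encoded names and drop any path separators
            file_name = str(email.header.make_header(
                email.header.decode_header(file_name))).replace('/', '_')
            data = part.get_payload(decode=True)
            if data is not None:
                with open('./src/' + file_name, 'wb') as att_file:
                    att_file.write(data)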
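
For comparison, the same step can be sketched with the newer EmailMessage API, where iter_attachments() yields only the attachment parts. This is an illustration under stated assumptions rather than the article's method: save_attachments and the fallback name attachment-N are hypothetical, and the target directory is created if it is missing.

```python
import tempfile
from email import message_from_bytes, policy
from email.message import EmailMessage
from pathlib import Path


def save_attachments(raw_bytes, target_dir):
    """Write every attachment of a raw message into target_dir; return the paths."""
    msg = message_from_bytes(raw_bytes, policy=policy.default)
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    saved = []
    for index, part in enumerate(msg.iter_attachments()):
        # Path(...).name drops directory components a sender may have put in the name.
        name = Path(part.get_filename() or f"attachment-{index}").name
        data = part.get_payload(decode=True)
        if data is None:
            continue
        path = target / name
        path.write_bytes(data)
        saved.append(path)
    return saved


# Demo with a locally built message instead of one fetched from a server.
demo = EmailMessage()
demo.set_content("See attachment.")
demo.add_attachment(b"hello", maintype="application",
                    subtype="octet-stream", filename="../notes.bin")
with tempfile.TemporaryDirectory() as tmp:
    for path in save_attachments(demo.as_bytes(), tmp):
        print(path.name, path.stat().st_size)
```

A saved xlsx file could then be handed to pandas, as the text suggests.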

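For the message body, we decode the first non-multipart part (normally the text or HTML body) and save it to a .txt file so it is easy to read later.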

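    @staticmethod
    def get_text_from_HTML(msg):
        """Decode the message body (HTML) and save it to a .txt file."""
        filename = Email_parse.get_email_name(msg)
        current_title = Email_parse.get_email_title(msg)
        print("filename:", filename, type(filename))
        for part in msg.walk():
            if not part.is_multipart():
                result = part.get_payload(decode=True)
                # Prefer the charset declared by the part; gbk is kept as the fallback
                charset = part.get_content_charset() or 'gbk'
                result = result.decode(charset, errors='replace')
                with open(f'./src/{current_title}.txt', 'w', encoding='utf-8') as f:
                    f.write(result)
                return result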
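
The function above stores the decoded body as it is, including any HTML tags. If only the visible text is wanted, one option is the standard library's html.parser. The sketch below is an assumption-laden illustration: _TextExtractor and html_part_to_text are hypothetical names, and gbk is used as the fallback charset only because the original code assumes it.

```python
from email import message_from_bytes
from email.message import EmailMessage
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.chunks.append(data.strip())


def html_part_to_text(msg, fallback_charset="gbk"):
    """Return the text of the first text/html part, or None if there is none."""
    for part in msg.walk():
        if part.get_content_type() != "text/html":
            continue
        raw = part.get_payload(decode=True)
        if raw is None:
            continue
        charset = part.get_content_charset() or fallback_charset
        extractor = _TextExtractor()
        extractor.feed(raw.decode(charset, errors="replace"))
        extractor.close()
        return "\n".join(extractor.chunks)
    return None


# Demo with a locally built HTML message.
demo = EmailMessage()
demo.set_content("<html><body><h1>Hello</h1><p>World</p></body></html>", subtype="html")
print(html_part_to_text(message_from_bytes(demo.as_bytes())))
```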

The complete code is as follows:

import email
import imaplib
from email.header import decode_header, make_header


class Email_parse:
    def __init__(self, remote_server_url, email_url, password):
        self.remote_server_url = remote_server_url
        self.email_url = email_url
        self.password = password

    @staticmethod
    def get_att(msg):
        """Find attachments and save them under ./src/ (the directory must already exist)."""
        for part in msg.walk():
            # get_filename() returns None for parts that carry no file name
            file_name = part.get_filename()
            if not file_name:
                continue
            # Decode RFC 2047 encoded names and drop any path separators
            file_name = str(make_header(decode_header(file_name))).replace('/', '_')
            data = part.get_payload(decode=True)
            if data is not None:
                with open('./src/' + file_name, 'wb') as att_file:
                    att_file.write(data)

    @staticmethod
    def get_email_title(msg):
        """Decode and return the subject of a message."""
        subject = decode_header(msg.get('subject', ''))
        if isinstance(subject[-1][0], bytes):
            # Fall back to UTF-8 when the header declares no charset
            title = subject[-1][0].decode(subject[-1][1] or 'utf-8', errors='replace')
        else:
            title = subject[-1][0]
        print("title:", title)
        return title

    @staticmethod
    def get_email_name(msg):
        """Return the decoded name of the first attachment, or None."""
        for part in msg.walk():
            file_name = part.get_filename()
            if file_name:
                filename = str(make_header(decode_header(file_name)))
                print("attachment name:", filename)
                return filename
        return None

    def main_parse_Email(self):
        """Entry point: log in to the IMAP server and process the newest message."""
        server = imaplib.IMAP4_SSL(self.remote_server_url, 993)
        server.login(self.email_url, self.password)
        server.select('INBOX')
        status, data = server.search(None, "ALL")
        if status != 'OK':
            raise Exception('read email error')
        emailids = data[0].split()
        mail_counts = len(emailids)
        print("count:", mail_counts)
        # IDs come back oldest first; this range selects only the newest message
        # and is empty for an empty inbox.
        for i in range(mail_counts - 1, max(mail_counts - 2, -1), -1):
            status, edata = server.fetch(emailids[i], '(RFC822)')
            msg = email.message_from_bytes(edata[0][1])
            Email_parse.get_att(msg)
            Email_parse.get_text_from_HTML(msg)
        server.logout()

    @staticmethod
    def get_text_from_HTML(msg):
        """Decode the message body (HTML) and save it to a .txt file."""
        filename = Email_parse.get_email_name(msg)
        current_title = Email_parse.get_email_title(msg)
        print("filename:", filename, type(filename))
        for part in msg.walk():
            if not part.is_multipart():
                result = part.get_payload(decode=True)
                # Prefer the charset declared by the part; gbk is kept as the fallback
                charset = part.get_content_charset() or 'gbk'
                result = result.decode(charset, errors='replace')
                with open(f'./src/{current_title}.txt', 'w', encoding='utf-8') as f:
                    f.write(result)
                return result


if __name__ == "__main__":
    remote_server_url = 'imap.qq.com'
    email_url = "*********@qq.com"
    password = "**********"
    demo = Email_parse(remote_server_url, email_url, password)
    demo.main_parse_Email()




Statement:
This article is reposted from jb51.net. In case of infringement, please contact admin@php.cn to have it removed.