Friday, July 16, 2021

Python imaplib: reading Gmail message bodies and attachments (python gmail attachment download)



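I want to fetch Gmail attachments automatically and then feed them into other automated tasks.

This can be done with imaplib.

Not yet implemented: converting the subject and from values obtained through message_from_string to UTF-8.

Code: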

import imaplib
import base64
import os
import email

1. Get a mail object connected to Gmail, along with the messages found

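def searchGmail(user, pwd, box, search_str):
    mail = imaplib.IMAP4_SSL("imap.gmail.com", 993)
    mail.login(user, pwd)
    mail.select(box)

    # imaplib sends mail.literal as an IMAP literal after the command, which
    # lets the UTF-8 query reach Gmail's X-GM-RAW search extension intact.
    mail.literal = search_str.encode("utf-8")
    stat, data = mail.search("utf-8", "X-GM-RAW")
    if stat == "OK":
        return mail, data
    else:
        return False, False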
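
The search_str passed to searchGmail is an ordinary Gmail search query. A minimal sketch of assembling one from parts, assuming a hypothetical helper named build_gmail_query (it is not part of imaplib or of the code above) and using only the in:, after: and plain-keyword forms that appear in this post's usage example:

```python
import datetime


def build_gmail_query(label="inbox", after=None, keyword=""):
    """Hypothetical helper: assemble a Gmail web-style search string."""
    parts = ["in:" + label]
    if after is not None:
        # The usage example in this post writes the date as YYYY/MM/DD.
        parts.append("after:" + after.strftime("%Y/%m/%d"))
    if keyword:
        parts.append(keyword)
    return " ".join(parts)


query = build_gmail_query(after=datetime.date(2021, 7, 15), keyword="invoice")
print(query)  # in:inbox after:2021/07/15 invoice

# These are the bytes that searchGmail would assign to mail.literal.
literal = query.encode("utf-8")
```

The resulting string can be passed as the search_str argument; a keyword with non-ASCII characters works the same way because the query is encoded to UTF-8 before it is sent.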

2. Get a mail object for a specified server, along with the messages found

Syntax for imapSearchStr:

https://afterlogic.com/mailbee-net/docs/MailBee.ImapMail.Imap.Search_overload_2.html

def searchTypicalMail(mailServer, serverPort, user, pwd, box, imapSearchStr):
    mail = imaplib.IMAP4_SSL(mailServer, serverPort)
    mail.login(user, pwd)
    mail.select(box)

    # an empty search string falls back to matching every message
    stat, data = mail.search(None, imapSearchStr or 'ALL')

    if stat == "OK":
        return mail, data
    else:
        return False, False
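
The imapSearchStr argument is a standard IMAP SEARCH criteria string. A minimal sketch of building one, assuming a hypothetical helper named build_imap_criteria (not part of imaplib). It covers only the FROM, SUBJECT and SINCE keys and does no escaping, so values that contain double quotes or backslashes are out of scope:

```python
import datetime

# English month abbreviations used by the IMAP date format, spelled out
# here so the result does not depend on the system locale.
_MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def build_imap_criteria(from_addr=None, subject=None, since=None):
    """Hypothetical helper: build a search string for searchTypicalMail."""
    terms = []
    if from_addr:
        terms.append('FROM "%s"' % from_addr)
    if subject:
        terms.append('SUBJECT "%s"' % subject)
    if since:
        # IMAP dates look like 15-Jul-2021.
        terms.append("SINCE %02d-%s-%d"
                     % (since.day, _MONTHS[since.month - 1], since.year))
    # With no terms at all, match every message.
    return "(" + " ".join(terms) + ")" if terms else "ALL"


print(build_imap_criteria("friend@gmail.com", "Hello"))
# (FROM "friend@gmail.com" SUBJECT "Hello")
```

Terms placed side by side are combined with a logical AND by the server, which is why the example matches messages that satisfy both conditions.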

3. Extract the ID list of the messages found

For example:

[b'47762', b'56351', b'64948']

def getMailIDs(data):
    mail_ids = data[0]
    return mail_ids.split()
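
The search call returns a list whose first element is a single bytes object of space-separated IDs, which is why getMailIDs splits data[0]. A minimal sketch of picking the newest few IDs, assuming a hypothetical helper named get_latest_ids and assuming the server lists IDs in ascending order, as the usage section of this post also does when it takes mail_ids[-1]:

```python
def get_latest_ids(data, count=1):
    """Hypothetical helper: return the last `count` IDs, newest first."""
    ids = data[0].split()
    if count <= 0:
        return []
    # Slice from the end, then reverse so the newest ID comes first.
    return ids[-count:][::-1]


# Sample data shaped like the second value returned by mail.search().
sample = [b"47762 56351 64948"]
print(get_latest_ids(sample, 2))  # [b'64948', b'56351']
```

An empty search result arrives as [b''], for which the helper returns an empty list instead of raising an IndexError.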

4. Fetch a message from the mail object by its ID

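def fetchMailByID(mail, mail_id):
    # stat and mail_id avoid shadowing the built-ins type and id
    stat, data = mail.fetch(mail_id, '(RFC822)')

    if stat == "OK":
        # data[0] is a (response header, raw message bytes) tuple
        raw_email = data[0][1]
        raw_email_str = raw_email.decode('utf-8')
        email_msg = email.message_from_string(raw_email_str)
        return email_msg
    else:
        return False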
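
Decoding the raw message as UTF-8 raises UnicodeDecodeError when a message carries 8-bit text in another charset. A minimal sketch of an alternative, assuming a hypothetical helper named parse_raw_email that hands the bytes directly to email.message_from_bytes from the standard library; the sample message below is made up for illustration:

```python
import email


def parse_raw_email(raw_email):
    """Hypothetical helper: parse fetched bytes without a str round trip."""
    return email.message_from_bytes(raw_email)


# A made-up message whose body is ISO-8859-1, not UTF-8.
sample = (
    b"From: a@example.com\r\n"
    b"Subject: test\r\n"
    b"Content-Type: text/plain; charset=iso-8859-1\r\n"
    b"Content-Transfer-Encoding: 8bit\r\n"
    b"\r\n"
    b"caf\xe9\r\n"
)

msg = parse_raw_email(sample)
# Decode the body with the charset the message declares for itself.
charset = msg.get_content_charset() or "utf-8"
body = msg.get_payload(decode=True).decode(charset)
print(body.strip())
```

The returned object supports the same walk(), get_payload() and header lookups as the one produced by message_from_string.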

To do:

email_msg['subject']

email_msg['from']

still need to be converted to UTF-8
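
One possible way to finish this step is the standard library's email.header module. A minimal sketch, assuming a hypothetical helper named decode_mime_header; decode_header and make_header are existing functions, and the encoded sample value is made up for illustration:

```python
from email.header import decode_header, make_header


def decode_mime_header(value):
    """Hypothetical helper: turn an RFC 2047 encoded header into a str."""
    if value is None:
        # The header is missing from the message.
        return ""
    return str(make_header(decode_header(value)))


# "=?UTF-8?B?...?=" is how a non-ASCII subject appears in the raw message.
print(decode_mime_header("=?UTF-8?B?5ris6Kmm?="))  # 測試
print(decode_mime_header("Hello"))  # Hello
```

It would be called as decode_mime_header(email_msg['subject']) and decode_mime_header(email_msg['from']).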

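5. Get the message content

When content_type is file, a list of file names is returned.

content_type takes one of the following values:

text: the plain-text part

html: the HTML part

file: save the attachment files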


def getMailContent(email_msg, content_type):
    fileNameList = []
    for part in email_msg.walk():
        if part.get_content_type() == 'text/plain':
            if content_type == 'text':
                # decode with the part's declared charset, defaulting to UTF-8
                charset = part.get_content_charset() or 'utf-8'
                return part.get_payload(decode=True).decode(charset)
            else:
                continue
        if part.get_content_type() == 'text/html':
            if content_type == 'html':
                charset = part.get_content_charset() or 'utf-8'
                return part.get_payload(decode=True).decode(charset)
            else:
                continue
        if part.get_content_maintype() == 'multipart':
            continue
        # parts without a Content-Disposition header are not attachments
        if part.get('Content-disposition') is None:
            continue
        if content_type == 'file':
            fileName = part.get_filename()

            if bool(fileName):
                filePath = os.path.join('./', fileName)

                if not os.path.isfile(filePath):
                    # write the decoded attachment into the current directory
                    with open(filePath, 'wb') as fp:
                        fp.write(part.get_payload(decode=True))
                    fileNameList.append(fileName)
                else:
                    print("file exists")
                    fileNameList.append(False)
            else:
                print("no fileName")
    return fileNameList
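
The attachment file name comes from the sender, so joining it to a directory unchanged lets a name such as ../../etc/passwd point outside that directory. A minimal sketch of a guard, assuming a hypothetical helper named safe_attachment_path that is not part of the code above:

```python
import os


def safe_attachment_path(file_name, save_dir="."):
    """Hypothetical helper: build a save path that stays inside save_dir."""
    # Treat backslashes as separators too, then keep only the last path
    # component so directory parts in the name are discarded.
    base_name = os.path.basename(file_name.replace("\\", "/"))
    if base_name in ("", ".", ".."):
        raise ValueError("unusable attachment name: %r" % file_name)
    return os.path.join(save_dir, base_name)


print(safe_attachment_path("../../etc/passwd", "downloads"))
print(safe_attachment_path("report.pdf"))
```

Inside getMailContent, its result could stand in for the os.path.join('./', fileName) call.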

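Usage:

Gmail only:

The search_str parameter uses the same query syntax as the search box on the Gmail web page; in the call below, the Chinese text at the end stands for a subject keyword.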
result_mail, data = searchGmail("me@gmail.com", "abcdefghijklmn", 'Inbox', u"in:inbox after:2021/07/15 郵件標題搜尋字")

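With a specified mail server. The drawback is that, on Gmail, this approach cannot search for UTF-8 text such as Chinese characters.

The following finds messages sent from friend@gmail.com whose subject contains Hello: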

result_mail, data = searchTypicalMail("imap.gmail.com", 993, "me@gmail.com", "abcdefghijklmn", 'Inbox', '(FROM "friend@gmail.com" SUBJECT "Hello")')

Get all the IDs

mail_ids = getMailIDs(data)

Fetch the most recent message
msg = fetchMailByID(result_mail, mail_ids[-1])

Save the attachment files
result = getMailContent(msg,'file')
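
The steps above leave the IMAP connection open after the attachments are saved. A minimal sketch of releasing it, assuming a hypothetical helper named release_mailbox; close() and logout() are existing imaplib.IMAP4 methods, and the helper expects the mail object returned by one of the search functions:

```python
def release_mailbox(mail):
    """Hypothetical helper: close the selected mailbox, then log out."""
    try:
        # close() leaves the mailbox chosen earlier with select().
        mail.close()
    finally:
        # Always log out, even when close() raised an error.
        mail.logout()
```

It would be called as release_mailbox(result_mail) once result has been read.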

 

References:

To read emails and download attachments in Python

https://medium.com/@sdoshi579/to-read-emails-and-download-attachments-in-python-6d7d6b60269


Sending search criteria with literals
https://stackoverflow.com/questions/5640327/python-imap-search-using-a-subject-encoded-with-iso-8859-1


How can I get an email message's text content using Python?

https://stackoverflow.com/questions/1463074/how-can-i-get-an-email-messages-text-content-using-python


Search operators you can use with Gmail

https://support.google.com/mail/answer/7190?hl=en.


Imap.Search Method (Boolean, String, String)
https://afterlogic.com/mailbee-net/docs/MailBee.ImapMail.Imap.Search_overload_2.html

 

Python IMAP Search from or to designated email address
https://stackoverflow.com/questions/10563218/python-imap-search-from-or-to-designated-email-address

