Looking for code to parse mail headers encoded according to RFC 2047 (Message Header Extensions for Non-ASCII Text).

Jul 16th, 2000 21:19
unknown unknown, François Pinard


I needed this soon after learning Python (so this is among the first 
Python code I wrote, and I would probably write something simpler today :-) 
and quickly wrote what appears below.  However, I found out after the fact 
that the Python library already had something for this.  See 
mimify.mime_decode_header.

# Handling of RFC 2047 (previously RFC 1522) headers.
import re

def to_latin1(text):
    # Decode every ISO-8859-1 "Q" encoded word found in text.
    return _sub_f(r'=\?ISO-8859-1\?Q\?([^?]*)\?=', re.I, _replace1, text)

def _replace1(match):
    # Inside an encoded word, '_' stands for a space and =XX for one byte.
    return _sub_f('=([0-9A-F][0-9A-F])', re.I, _replace2,
                  match.group(1).replace('_', ' '))

def _replace2(match):
    # Turn the two hex digits into the corresponding Latin-1 character.
    return chr(int(match.group(1), 16))

def _sub_f(pattern, flags, function, text):
    # Replace each match of pattern in text with function(match).
    matcher = re.compile(pattern, flags).search
    position = 0
    results = []
    while True:
        match = matcher(text, position)
        if not match:
            results.append(text[position:])
            return ''.join(results)
        results.append(text[position:match.start(0)])
        position = match.end(0)
        results.append(function(match))
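
The remark about writing "something simpler today" can be illustrated with a short sketch. It assumes a current Python 3, where the mimify module is no longer available and the standard email.header module covers the same ground. The names to_latin1_short and decode_with_stdlib are hypothetical, chosen only for this illustration; the first mirrors the code above using re.sub with a callback, the second hands the whole job to the library and is not limited to ISO-8859-1 or to the "Q" encoding.

```python
import re
from email.header import decode_header, make_header

# One ISO-8859-1 "Q" encoded word, e.g. =?ISO-8859-1?Q?Fran=E7ois?=
_ENCODED_WORD = re.compile(r'=\?ISO-8859-1\?Q\?([^?]*)\?=', re.I)
# One =XX hex escape inside the encoded word.
_HEX_ESCAPE = re.compile(r'=([0-9A-F]{2})', re.I)

def to_latin1_short(text):
    """Hypothetical compact variant of to_latin1, using re.sub callbacks."""
    def decode_word(match):
        # '_' stands for a space; it is replaced first so =5F stays '_'.
        payload = match.group(1).replace('_', ' ')
        return _HEX_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), payload)
    return _ENCODED_WORD.sub(decode_word, text)

def decode_with_stdlib(text):
    """Hypothetical helper: let the email package decode the header."""
    return str(make_header(decode_header(text)))

header = '=?ISO-8859-1?Q?Fran=E7ois_Pinard?='
print(to_latin1_short(header))
print(decode_with_stdlib(header))
```

The regex version only understands ISO-8859-1 "Q" words, exactly like the original; for headers in other charsets or in the "B" (base64) encoding, the library route is the safer choice.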