header continuation issue in notmuch frontend/alot/pythons email module
Justus Winter
4winter at informatik.uni-hamburg.de
Sun Jun 23 06:11:45 PDT 2013
Hi,
I recently had a problem replying to a mail written by Thomas Schwinge
using an oldish notmuch. Not sure if it has been fixed in more recent
versions, but I think notmuch could improve upon its header
generation (see below). Problematic part of the mail:
~~~ snip ~~~
[...]
To: someone at example.org, "line
 break" <linebreak at example.org>, someoneelse at example.org
User-Agent: Notmuch/0.9-101-g81dad07 (http://notmuchmail.org) Emacs/23.4.1 (i486-pc-linux-gnu)
[...]
~~~ snap ~~~
http://tools.ietf.org/html/rfc2822#section-2.2.3 says:
Note: Though structured field bodies are defined in such a way that
folding can take place between many of the lexical tokens (and even
within some of the lexical tokens), folding SHOULD be limited to
placing the CRLF at higher-level syntactic breaks. For instance, if
a field body is defined as comma-separated values, it is recommended
that folding occur after the comma separating the structured items in
preference to other places where the field could be folded, even if
it is allowed elsewhere.
So, per the RFC, notmuch SHOULD place the line breaks after the commas.
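To illustrate what folding after the commas could look like, here is a minimal sketch. It is not notmuch's code: the function name fold_after_commas and the width parameter are made up for this example, and the addresses are written with "@" on the assumption that the archive obfuscated them.

```python
def fold_after_commas(name, addresses, width=78):
    """Fold an address header, breaking only after the separating commas."""
    lines = []
    current = name + ":"
    last = len(addresses) - 1
    for i, addr in enumerate(addresses):
        item = addr if i == last else addr + ","
        # Start a continuation line if this item would overflow, but never
        # leave the first line holding only the field name.
        if len(current) + 1 + len(item) > width and current != name + ":":
            lines.append(current)
            current = " " + item  # continuation lines start with whitespace
        else:
            current += " " + item
    lines.append(current)
    return "\r\n".join(lines)


folded = fold_after_commas(
    "To",
    ["someone@example.org",
     '"line break" <linebreak@example.org>',
     "someoneelse@example.org"],
    width=40,
)
print(folded)
```

A single address longer than the width is left on one line here, since the sketch never breaks inside an item.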
The RFC goes on:
The process of moving from this folded multiple-line representation
of a header field to its single line representation is called
"unfolding". Unfolding is accomplished by simply removing any CRLF
that is immediately followed by WSP. Each header field should be
treated in its unfolded form for further syntactic and semantic
evaluation.
My interpretation is that unfolding simply removes any line breaks
first, so the value does not contain any newlines. But Python's email
module treats quoted and unquoted parts of the value differently:
~~~ snip ~~~
from __future__ import print_function
import email
from email.utils import getaddresses
m = email.message_from_string('''To: "line
 break" <linebreak at example.org>, line
 break <linebreak at example.org>''')
print("m['To'] = ", m['To'])
print("getaddresses(m.get_all('To')) = ", getaddresses(m.get_all('To')))
~~~ snap ~~~
% python3 test.py
m['To'] = "line
 break" <linebreak at example.org>, line
 break <linebreak at example.org>
getaddresses(m.get_all('To')) = [('line\n break', 'linebreak at example.org'), ('line break', 'linebreak at example.org')]
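For comparison, here is a rough sketch of unfolding the value as the RFC describes before handing it to getaddresses. The unfold helper is my own, not part of the email module, and I am assuming real "@" addresses and that accepting a bare LF as well as CRLF is acceptable:

```python
import email
import re
from email.utils import getaddresses


def unfold(value):
    # RFC 2822 section 2.2.3: remove any line break that is immediately
    # followed by whitespace (bare LF is accepted here as well as CRLF).
    return re.sub(r'\r?\n(?=[ \t])', '', value)


raw = ('To: "line\n break" <linebreak@example.org>, line\n'
       ' break <linebreak@example.org>\n\n')
m = email.message_from_string(raw)

# Unfold every To value first, then parse the addresses.
unfolded = [unfold(v) for v in m.get_all('To')]
parsed = getaddresses(unfolded)
print(parsed)
```

With the value unfolded first, the quoted and the unquoted display name should both come out without an embedded newline.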
I believe that is what's preventing me from replying to the message
using alot without sanitizing the To header first. Not really sure who
is wrong or right here... any thoughts?
Justus