header continuation issue in notmuch frontend/alot/pythons email module

Justus Winter 4winter at informatik.uni-hamburg.de
Sun Jun 23 06:11:45 PDT 2013


Hi,

I recently had a problem replying to a mail written by Thomas Schwinge
using an oldish notmuch. I am not sure whether it has been fixed in more
recent versions, but I think notmuch could improve upon its header
generation (see below). The problematic part of the mail:

~~~ snip ~~~
[...]
To: someone at example.org, "line
 break" <linebreak at example.org>, someoneelse at example.org
User-Agent: Notmuch/0.9-101-g81dad07 (http://notmuchmail.org) Emacs/23.4.1 (i486-pc-linux-gnu)
[...]
~~~ snap ~~~

http://tools.ietf.org/html/rfc2822#section-2.2.3 says:

   Note: Though structured field bodies are defined in such a way that
   folding can take place between many of the lexical tokens (and even
   within some of the lexical tokens), folding SHOULD be limited to
   placing the CRLF at higher-level syntactic breaks.  For instance, if
   a field body is defined as comma-separated values, it is recommended
   that folding occur after the comma separating the structured items in
   preference to other places where the field could be folded, even if
   it is allowed elsewhere.

So notmuch "rfc-SHOULD" place the newlines after the comma.
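
To illustrate what folding "after the comma" could look like, here is a
minimal sketch in Python. It is not how notmuch generates headers; the
function name fold_address_header, the width of 78 and the "@" addresses
are assumptions made up for this example.

```python
def fold_address_header(name, addresses, width=78):
    """Fold an address-list header only after the separating commas.

    Hypothetical helper: each address stays on one line, so a quoted
    display name is never split by a line break.
    """
    prefix = name + ":"
    lines = []
    current = prefix
    for i, addr in enumerate(addresses):
        # Every address except the last one carries its trailing comma.
        piece = addr + ("," if i < len(addresses) - 1 else "")
        if current != prefix and len(current) + 1 + len(piece) > width:
            # Line would get too long: fold here, i.e. after a comma.
            lines.append(current)
            current = " " + piece
        else:
            current += " " + piece
    lines.append(current)
    return "\r\n".join(lines)


header = fold_address_header(
    "To",
    [
        "someone@example.org",
        '"line break" <linebreak@example.org>',
        "someoneelse@example.org",
    ],
    width=40,
)
print(header)
```

With a width this small every address ends up on its own line, and each
continuation line starts with a single space, so the quoted name is never
broken.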

The RFC goes on:

   The process of moving from this folded multiple-line representation
   of a header field to its single line representation is called
   "unfolding". Unfolding is accomplished by simply removing any CRLF
   that is immediately followed by WSP.  Each header field should be
   treated in its unfolded form for further syntactic and semantic
   evaluation.
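
As a rough sketch of that rule (my own illustration, not code taken from
notmuch or alot), unfolding can be written as a single substitution. The
function name unfold is made up, and accepting a bare LF in addition to
CRLF is an assumption, since strings inside Python usually carry "\n"
only.

```python
import re


def unfold(value):
    """Remove every line break that is immediately followed by WSP.

    Only the line break is removed; the space or tab that follows it
    stays in the value.
    """
    return re.sub(r"\r?\n(?=[ \t])", "", value)


folded = '"line\r\n break" <linebreak@example.org>'
print(unfold(folded))
```

A line break that is not followed by a space or tab is left alone, since
it does not start a continuation line.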

My interpretation is that unfolding simply removes any line breaks
first, so the value does not contain any newlines. But Python's email
module treats quoted and unquoted parts of the value differently:

~~~ snip ~~~
from __future__ import print_function
import email
from email.utils import getaddresses

# The same name is folded twice: once inside quotes, once unquoted.
m = email.message_from_string('''To: "line
 break" <linebreak at example.org>, line
 break <linebreak at example.org>''')
# Show the raw header value, then the parsed (name, address) pairs.
print("m['To'] = ", m['To'])
print("getaddresses(m.get_all('To')) = ", getaddresses(m.get_all('To')))
~~~ snap ~~~

% python3 test.py
m['To'] =  "line
 break" <linebreak at example.org>, line
 break <linebreak at example.org>
getaddresses(m.get_all('To')) =  [('line\n break', 'linebreak at example.org'), ('line break', 'linebreak at example.org')]
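
For comparison, a minimal sketch of sanitizing the header before parsing
it. The helper name unfolded_addresses is hypothetical, and the "@"
addresses are an assumption for the example. It relies on the parser
only keeping line breaks that belong to folded continuation lines, so
dropping every line break from the value amounts to unfolding it.

```python
import email
from email.utils import getaddresses


def unfolded_addresses(message, field):
    """Unfold each value of `field`, then parse the addresses."""
    # splitlines() drops the line breaks; the leading whitespace of each
    # continuation line is kept, as unfolding requires.
    values = ["".join(v.splitlines()) for v in message.get_all(field, [])]
    return getaddresses(values)


raw = (
    'To: "line\n'
    ' break" <linebreak@example.org>, line\n'
    ' break <linebreak@example.org>\n'
    '\n'
    'body\n'
)
m = email.message_from_string(raw)
print(unfolded_addresses(m, "To"))
```

With the value unfolded first, both names should come out as 'line
break', so the newline embedded in the quoted name is the only
difference between the two entries in the output above.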

I believe that is what's preventing me from replying to the message
using alot without sanitizing the To header first. Not really sure who
is wrong or right here... any thoughts?

Justus

