[PATCH] mime_node_open: skip envelope from lines at the start of messages
Daniel Kahn Gillmor
dkg at fifthhorseman.net
Fri Mar 9 08:20:01 PST 2012
On 03/09/2012 08:56 AM, David Bremner wrote:
> Some MDAs such as procmail (in MH mode), and exim (doing local
> delivery in some configurations of the appendfile transport) add a
> line to the front of a message with "From " followed by envelope
> sender. Since this is not a proper RFC822 header field, gmime (at
> least since version 2.6) refuses to parse it, unless in mbox mode.
>
> This change reads the first line of the file and, if it starts with
> "From ", passes the stream to gmime starting from the second line.
>
> This makes mime_node_open more consistent with (but still stricter
> than) the permissive behaviour of notmuch_file_get_header
> (message-file.c), which allows a certain number of "broken_headers".
>
> We avoid putting gmime into mbox mode in case of side effects; this
> leaves the situation of mboxes accidentally indexed by notmuch the
> same as before, namely "undefined behaviour". Ideally they should at
> least be warned by notmuch-new. Although strict rfc822 adherence
> would be one way to detect mboxes, it doesn't seem to fit with the
> spirit or code of message-file.c.

The above justification (and the version of the associated patch without
the memory leak and using strncmp instead of strcmp) seem good to me.

While I'd prefer to have nothing but spic-and-span, perfectly clean
RFC2822 messages, we have (perhaps accidentally) traditionally supported
message files with leading "From " lines, so such files will already
have been indexed by previous versions of notmuch.

This patch defines the non-MIME variance we're willing to accept quite
narrowly (just a single leading line that starts with "From ", no
escaping of the rest of the text), avoids breaking compatibility with
existing indexes, and lets us index mail from some plausible MTA
delivery configurations.
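
To make the narrow rule above concrete, here is a rough sketch in plain C.
It is not the code from the patch: skip_envelope_from is a hypothetical
name, and it works on an in-memory buffer rather than on a gmime stream,
purely to illustrate "one leading line, prefix compared with strncmp,
nothing else touched".

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper for illustration only; not the function from the
 * patch under review.
 *
 * Returns the byte offset at which an RFC 2822 parser should start
 * reading: 0 if the buffer does not begin with "From ", otherwise the
 * offset just past the first newline. If the "From " line has no
 * terminating newline, the whole buffer is skipped and len is returned.
 */
static size_t
skip_envelope_from (const char *buf, size_t len)
{
    static const char prefix[] = "From ";
    const size_t prefix_len = sizeof (prefix) - 1;
    const char *nl;

    /* Compare only the prefix: strncmp, not strcmp, because the rest of
     * the line (sender, date) is arbitrary. */
    if (len < prefix_len || strncmp (buf, prefix, prefix_len) != 0)
        return 0;

    /* Skip exactly one line; later "From " lines are left alone. */
    nl = memchr (buf, '\n', len);
    return nl ? (size_t) (nl - buf) + 1 : len;
}
```

Note that a real "From: " header does not match, since the fifth byte is
a colon rather than a space, so well-formed messages are passed through
from offset 0.
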

The only way it would be better is if it were to auto-detect that a file
is actually a multi-message mbox, and alert the user to the fact that
all but the first message in the mbox are unindexed. But we don't
currently do that anyway, so it's not a regression (and that additional
cleanup should probably be a separate patch).

so: LGTM.

--dkg