[PATCH] mime_node_open: skip envelope from lines at the start of messages

Daniel Kahn Gillmor dkg at fifthhorseman.net
Fri Mar 9 08:20:01 PST 2012


On 03/09/2012 08:56 AM, David Bremner wrote:
> Some MDAs such as procmail (in MH mode), and exim (doing local
> delivery in some configurations of the appendfile transport) add a
> line to the front of a message with "From " followed by envelope
> sender.  Since this is not a proper RFC822 header field, gmime (at
> least since version 2.6) refuses to parse it, unless in mbox mode.
>
> This change reads the first line of the file, and if it starts with
> "From ", passes the stream to gmime starting from the second line.
>
> This makes mime_node_open more consistent with (but still stricter
> than) the permissive behaviour of notmuch_file_get_header
> (message-file.c), which allows a certain number of "broken_headers".
>
> We avoid putting gmime into mbox mode in case of side effects; this
> leaves the situation of mboxes accidentally indexed by notmuch the
> same as before, namely "undefined behaviour".  Ideally they should at
> least be warned by notmuch-new.  Although strict rfc822 adherence
> would be one way to detect mboxes, it doesn't seem to fit with the
> spirit or code of message-file.c.

The above justification (and the version of the associated patch without 
the memory leak and using strncmp instead of strcmp) seems good to me.

While I'd prefer to have nothing but spic-and-span, perfectly clean 
RFC2822 messages, we have (perhaps accidentally) traditionally supported 
message files with leading "From " lines, so they will be 
already-indexed by previous versions of notmuch.

This patch defines the non-MIME variance we're willing to accept quite 
narrowly (just a single leading line that starts with "From ", no 
escaping of the rest of the text), avoids breaking compatibility with 
existing indexes, and supports indexing mail delivered by some plausible 
MTA delivery configurations.
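
To illustrate how narrow that rule is, here is a minimal sketch in C. It 
is not the code from the patch: the helper name skip_envelope_from is 
hypothetical, and it works on an in-memory buffer rather than on a gmime 
stream, so that it stays self-contained. It returns the offset at which 
a MIME parser would start reading.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of notmuch or gmime): return the offset
 * in buf at which MIME parsing should begin.  If the buffer starts with
 * an envelope "From " line, that is the byte after the first newline;
 * otherwise it is 0.  Only the first line is examined, and nothing in
 * the rest of the buffer is unescaped or otherwise modified.
 */
static size_t
skip_envelope_from (const char *buf, size_t len)
{
    static const char prefix[] = "From ";
    const size_t prefix_len = sizeof (prefix) - 1;
    const char *newline;

    /* strncmp, not strcmp: we only want to compare the 5-byte prefix. */
    if (len < prefix_len || strncmp (buf, prefix, prefix_len) != 0)
        return 0;

    newline = memchr (buf, '\n', len);
    if (newline == NULL)
        return len;     /* the whole buffer is the envelope line */

    return (size_t) (newline - buf) + 1;
}
```

Note that a real "From: " header is left alone, because its sixth 
character is a colon rather than a space.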

The only way it would be better is if it were to auto-detect that a file 
is actually a multi-message mbox, and alert the user to the fact that 
all but the first message in the mbox are unindexed.  But we don't 
currently do that, so it's not a regression (and that additional 
cleanup should probably be a separate patch anyway).
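
For whoever picks up that separate patch, one possible heuristic is 
sketched below in C. This is an assumption on my part, not anything 
notmuch does today: the function name count_mbox_separators is 
hypothetical, it assumes LF line endings, and it treats a line starting 
with "From " as a separator only at the start of the buffer or right 
after an empty line. A count above 1 would suggest a multi-message 
mbox. It can give false positives when a message body has an unescaped 
"From " line after a blank line, so it is only good enough for a 
warning.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical heuristic (not part of notmuch): count lines that look
 * like mbox separators, i.e. lines starting with "From " that appear
 * either at the very start of the buffer or directly after an empty
 * line.  Assumes LF line endings.
 */
static size_t
count_mbox_separators (const char *buf, size_t len)
{
    size_t count = 0;
    size_t pos = 0;
    int prev_blank = 1;   /* start of buffer counts as "after a blank line" */

    while (pos < len) {
        const char *nl = memchr (buf + pos, '\n', len - pos);
        size_t line_len = nl ? (size_t) (nl - (buf + pos)) : len - pos;

        if (prev_blank && line_len >= 5 &&
            strncmp (buf + pos, "From ", 5) == 0)
            count++;

        prev_blank = (line_len == 0);
        pos += line_len + (nl ? 1 : 0);   /* step past the newline, if any */
    }

    return count;
}
```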

so: LGTM.

	--dkg

