Bug#749890: python3-notmuch: missing header in mbox message -> NullPointerError

Austin Clements amdragon at MIT.EDU
Fri Jun 27 12:36:59 PDT 2014


Quoth David Bremner on Jun 27 at 12:45 pm:
> Jakub Wilk <jwilk at debian.org> writes:
> 
> > * David Bremner <david at tethera.net>, 2014-06-26, 18:26:
> >>>0.18.1~rc0-1 is much better, thanks!
> >>>
> >>>I still get NullPointerError for one of my messages, though. :-( The 
> >>>message is in the MBOXCL format (where message body size is indicated 
> >>>by the Content-Length field), and has lines starting with "From " in 
> >>>the message body. I've attached a new test case.
> >>
> >>That message (and at a guess other MBOXCL files) is ignored as a 
> >>non-mail file by 0.18.1 "notmuch new".
> >
> > Indeed.
> >
> >>Is this another case of files which were indexed with an older version 
> >>of notmuch causing problems with a newer version?
> >
> > Yes, that's what I meant. Sorry for not being clear.
> 
> As a point of information, I bisected with the following test script:
> 
> #!/usr/bin/env bash
> test_description='"notmuch new" in several variations'
> . ./test-lib.sh
> 
> test_begin_subtest "Support single-message mbox with content length (deprecated)"
> cat > "${MAIL_DIR}"/mbox_file2 <<EOF
> From jwilk  Fri May 30 14:09:05 2014
> Subject: Hello world!
> Content-Length: 12
> Lines: 1
> 
> From world!
> 
> EOF
> output=$(NOTMUCH_NEW 2>&1)
> test_expect_equal "$output" \
> "Added 1 new message to the database."
> 
> 
> test_done
> 
> The commit where the behaviour changed to reject MBOXCL files with 
> 'From ' in the body was 610f0e09929. This was between 0.14 and 0.15.
> I'd say this was unintentional, although it isn't clear to me yet how
> easy it is to fix.

Thanks for bisecting this, David.

Unfortunately, when it comes to mbox, the only winning move is not to
play.

Commit 610f0e09929 matters here because it *added* support for
mbox (or, rather, this weird but surprisingly common chimera of
mbox-formatted message files with maildir-formatted file names).
Previously, notmuch assumed *everything* was a maildir-formatted
message file; that is, one message per file.  It "worked" for mboxcl
because it had no idea what either mbox or mboxcl was.  But it would
choke hard when it encountered a large, multi-message mbox archive
because it would try to index the whole thing as one giant email.  In
an effort to avoid this, I added explicit support for single-message
mbox files (to keep the chimerians happy).  But at that point we lost:
there simply is no way to reliably and programmatically distinguish
the many variants of mbox (see
http://www.jwz.org/doc/content-length.html for a good discussion of
this).
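
To make the ambiguity concrete, here is a rough Python sketch (not
notmuch's actual code) that reads David's test message two ways.  Both
function names are made up for illustration, and both readers are
deliberately naive: one treats every line starting with "From " as a
message separator, the other trusts the Content-Length header.

```python
# The single-message MBOXCL file from the test script in this thread.
SAMPLE = (
    b"From jwilk  Fri May 30 14:09:05 2014\n"
    b"Subject: Hello world!\n"
    b"Content-Length: 12\n"
    b"Lines: 1\n"
    b"\n"
    b"From world!\n"
    b"\n"
)


def count_by_from_lines(data: bytes) -> int:
    """Count messages as a naive 'From '-delimited reader would."""
    return sum(1 for line in data.splitlines() if line.startswith(b"From "))


def count_by_content_length(data: bytes) -> int:
    """Count messages by trusting each message's Content-Length header."""
    count, pos = 0, 0
    while pos < len(data):
        if data.startswith(b"\n", pos):
            pos += 1  # skip the blank line that follows a message body
            continue
        header_end = data.find(b"\n\n", pos)
        if header_end == -1:
            break  # no header/body separator: stop rather than guess
        length = 0
        for line in data[pos:header_end].split(b"\n"):
            name, _, value = line.partition(b":")
            if name.strip().lower() == b"content-length":
                length = int(value.strip())
        pos = header_end + 2 + length  # jump over the body without scanning it
        count += 1
    return count


print(count_by_from_lines(SAMPLE))      # one reading of the file
print(count_by_content_length(SAMPLE))  # another reading of the same bytes
```

The two readers disagree about how many messages the same bytes
contain, and nothing in the file itself says which reading was
intended.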

So, I'm afraid my best advice is to convert your mboxcl files to
something else.  Probably maildir, both because you're storing them in
a maildir (I assume?) and because it's easy: just strip off the first
line (the "From " separator line) of each single-message file.  I
don't think there's anything notmuch can do to fix this without
breaking something else.
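
A minimal sketch of that conversion in Python, assuming each file
holds exactly one message; the function name is hypothetical and this
is not part of notmuch.  It only drops a leading "From " separator
line and leaves everything else, including the now-redundant
Content-Length header, untouched.

```python
def strip_mbox_from_line(data: bytes) -> bytes:
    """Return the message with a leading mbox 'From ' line removed.

    Assumes `data` is a single-message file; files that do not start
    with a 'From ' line are returned unchanged.
    """
    first_line, _, rest = data.partition(b"\n")
    if first_line.startswith(b"From "):
        return rest
    return data


if __name__ == "__main__":
    message = (
        b"From jwilk  Fri May 30 14:09:05 2014\n"
        b"Subject: Hello world!\n"
        b"Content-Length: 12\n"
        b"Lines: 1\n"
        b"\n"
        b"From world!\n"
    )
    # The body line "From world!" survives; only the separator is removed.
    print(strip_mbox_from_line(message).decode("ascii"))
```

Working on bytes avoids guessing at the message's character encoding.
Try it on copies of the files first, then let "notmuch new" pick up
the converted ones.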

