[notmuch] notmuch new: Memory problem

Dominik Epple dominik.epple at googlemail.com
Mon Nov 23 08:26:41 PST 2009


Hi,

2009/11/20 Carl Worth <cworth at cworth.org>:
> On Fri, 20 Nov 2009 09:56:50 +0100, Dominik Epple <dominik.epple at googlemail.com> wrote:
>> Is there a problem with the number of my mails? I currently have over
>> 40.000 Mails... they live currently in mbox files, I created a Maildir
>> with mb2md-3.20.pl.
>
> I'm suspecting that you have some big files in there, (such as indexes
> from some other mail program). We had code in notmuch to detect and
> ignore these, but a recent bug had broken that.
>
> I just fixed this code as of the below commit. So please update and try
> again and let us know if things work any better.

OK, one of the problems seems to be solved. The info: output shows
that the code now actually ignores non-email data. The ignored files
are small fragments of real mail, so the mb2md code evidently made
errors there.

But I have run into a different issue. I have a lot of files in the
Maildir that contain base64-encoded binary data. (A remote site sends
me its daily backup logs.) Those files are all 2.4 megabytes in size.
By adding some debug code to notmuch-new.c, I found out that the
program becomes very slow and consumes a lot of memory when adding
these files. I just killed it when it had consumed 2 GByte again.
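
To illustrate the kind of check that helps spot such files, here is a
minimal sketch of a size test based on stat(). It is only an
illustration: file_exceeds_size is a made-up helper, not a function
from notmuch-new.c, and the limit is whatever the caller chooses.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper, not part of notmuch: returns 1 if the file at
 * `path` is larger than `limit` bytes, 0 if it is not, and -1 if
 * stat() fails (for example because the file does not exist). */
int
file_exceeds_size (const char *path, off_t limit)
{
    struct stat st;

    if (stat (path, &st) != 0)
        return -1;

    return st.st_size > limit ? 1 : 0;
}
```

Called with a limit of, say, one megabyte before a file is added, a
check like this would flag the 2.4 megabyte backup mails.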

So, as you suspected, the problem seems to stem from large files. But
those large files are not indexes or similar files from other mail
programs; they are valid emails which contain a lot of (encoded)
binary data.

Perhaps we should be able to configure notmuch so that it ignores
all mails that match a specific pattern (like "Subject: Backup logs
from.*")
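
A rough sketch of how such a pattern test could work, using POSIX
regular expressions. This is an assumption about one possible
approach, not existing notmuch code: header_matches_pattern is a
hypothetical helper, and the pattern would have to come from some
configuration option that does not exist yet.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper, not part of notmuch: returns 1 if `header_line`
 * matches the POSIX extended regular expression `pattern`, 0 if it
 * does not, and -1 if the pattern cannot be compiled. */
int
header_matches_pattern (const char *header_line, const char *pattern)
{
    regex_t re;
    int rc;

    /* REG_NOSUB: only a yes/no answer is needed, no match offsets. */
    if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    rc = regexec (&re, header_line, 0, NULL, 0);
    regfree (&re);

    return rc == 0 ? 1 : 0;
}
```

With the pattern "^Subject: Backup logs from.*", a header line such as
"Subject: Backup logs from host1" would match and the mail could be
skipped before it is indexed.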

Regards
Dominik

