[notmuch] notmuch new: Memory problem
dominik.epple at googlemail.com
Wed Nov 25 01:39:57 PST 2009
I repeated the procedure (mb2md, notmuch new), but first I moved all
those large emails with backup logs into a separate folder, which I
deleted before running "notmuch new". Then "notmuch new" works as expected.
So the problem does indeed stem from too many overly large files being
present. (I actually found some as large as 40M, not just 2.4M,
as written in previous mails.)
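
For anyone who wants to check their own Maildir for such files before
running "notmuch new", here is a minimal sketch in Python. It is not part
of notmuch; the function name find_large_files and the 1 MB default
threshold are my own assumptions, chosen only for illustration. It walks
the directory tree and reports every regular file above the threshold,
largest first.

```python
import os


def find_large_files(maildir, threshold_bytes=1_000_000):
    """Return (size, path) pairs for files larger than threshold_bytes.

    The result is sorted with the largest file first. Both the function
    name and the default threshold are illustrative assumptions.
    """
    hits = []
    for root, _dirs, files in os.walk(maildir):
        for name in files:
            path = os.path.join(root, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                # File vanished or is unreadable; skip it.
                continue
            if size > threshold_bytes:
                hits.append((size, path))
    hits.sort(reverse=True)
    return hits


# Example usage (the path is a placeholder):
# for size, path in find_large_files(os.path.expanduser("~/Maildir")):
#     print(f"{size / 1e6:8.1f} MB  {path}")
```

The files it lists can then be inspected and moved out of the Maildir by
hand, which is what I did above.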
2009/11/23 Dominik Epple <dominik.epple at googlemail.com>:
> 2009/11/20 Carl Worth <cworth at cworth.org>:
>> On Fri, 20 Nov 2009 09:56:50 +0100, Dominik Epple <dominik.epple at googlemail.com> wrote:
>>> Is there a problem with the number of my mails? I currently have over
>>> 40.000 Mails... they live currently in mbox files, I created a Maildir
>>> with mb2md-3.20.pl.
>> I'm suspecting that you have some big files in there, (such as indexes
>> from some other mail program). We had code in notmuch to detect and
>> ignore these, but a recent bug had broken that.
>> I just fixed this code as of the below commit. So please update and try
>> again and let us know if things work any better.
> Ok, one of the problems seems to be solved. One can learn from the
> info: output that the code actually ignores non-email data. These
> files are small and fragments of real mail. Obviously the mb2md code
> made errors there.
> But I ran into a different issue. I have a lot of files in the Maildir
> which contain base64 encoded binary data. (Some remote site sends me
> its daily backup logs.) Those files are each 2.4 megabytes in size.
> By adding some debug code to notmuch-new.c, I found out that the
> program becomes very slow and consumes a lot of memory when adding
> these files. I just killed it when it consumed 2 GByte again.
> So as you suspected, the problem seems to stem from large files. But
> those large files are not indices or stuff like that from different
> mail programs, but they are valid emails which contain a lot of
> (encoded) binary data.
> Perhaps we should be able to configure notmuch so that it ignores
> all mails that match a specific pattern (like "Subject: Backup logs