[PATCH] test: Add test for searching of uncommonly encoded messages

Michal Sojka sojkam1 at fel.cvut.cz
Thu Feb 23 23:00:02 PST 2012


On Fri, 24 Feb 2012, Serge Z wrote:
> 
> Quoting Michal Sojka (2012-02-24 04:33:15)
> >Emails that are encoded differently than as ASCII or UTF-8 are not
> >indexed properly by notmuch. It is not possible to search for non-ASCII
> >words within those messages.
> 
> Ok. But we can preprocess each incoming message right after 'getmail' to
> convert it from HTML to text and to UTF-8 encoding. One solution is to create a
> separate script for this and make getmail pipe all messages to this script, and
> then to notmuch. But it would be better if the maildir contained only the
> original messages, so the question is: can we make the notmuch indexing engine
> index the preprocessed message while the maildir keeps the original message
> as it was obtained?

Hi,

I'm not a big fan of adding a "preprocessor". First, I think that both
reasons you mention are actually bugs, and it would be better to fix them
for everybody than to require each user to configure some preprocessor.
Second, depending on what your preprocessor does and how, the
initial mail indexing could be much slower, which is also not something
people want.

Do you have any other use case for the preprocessor besides utf8 and
html->text conversions?

Cheers,
-Michal

