searching: '*analysis' vs 'reanalysis'

Austin Clements aclements at csail.mit.edu
Mon Jun 6 12:20:19 PDT 2016


On Mon, Jun 6, 2016 at 1:29 PM, David Bremner <david at tethera.net> wrote:

> Sebastian Fischmeister <sfischme at uwaterloo.ca> writes:
>
> >
> > I ran into this problem before as well. Storage is cheap. Notmuch could
> > index all emails with reversed text to get around some of this
> > problem. It doesn't solve the problem of *analysis*, but it's still an
> > improvement.
>
> It would probably be more useful to have brute force regexp searches on
> headers.  Austin did some experiments that sounded promising, where you
> basically postprocess the result of a xapian query with a regexp. OTOH,
> I don't know what kept him from proposing this for mainline. If it was
> just parser issues, those are probably more or less solved now, at least
> for people using xapian 1.3+
>

The experiment was specifically for regexp matching on the subject header,
but it should work for any header we store a literal copy of in the
database. The code is here, though in its current form it builds on my
custom query parser:
https://github.com/aclements/notmuch/commit/ce41b29aba4d9b84e2f1eb6ed8df67065196c960.
Based on my understanding of Xapian 1.3+ field processors, these days it
should be quite easy to hook the PostingSource in that commit into the
Xapian QueryParser.
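
As a rough illustration of the general idea (postprocessing the results of a
broader query with a regexp over a stored header), here is a minimal Python
sketch. It is not the code from the commit above, which works through a
Xapian PostingSource. The Message type, the filter_by_subject function, and
the sample data are all hypothetical stand-ins, not part of notmuch or
Xapian.

```python
import re
from typing import Iterable, Iterator, NamedTuple


class Message(NamedTuple):
    # Hypothetical stand-in for a message whose subject is stored
    # literally in the database; this is not a notmuch type.
    message_id: str
    subject: str


def filter_by_subject(candidates: Iterable[Message],
                      pattern: str) -> Iterator[Message]:
    """Yield only the candidates whose stored subject matches the regexp."""
    regexp = re.compile(pattern, re.IGNORECASE)
    for message in candidates:
        # re.search matches anywhere in the string, so the pattern
        # "analysis" also finds "reanalysis", which a term-based
        # index lookup would miss.
        if regexp.search(message.subject):
            yield message


# Assumption: these candidates came back from a broader index query.
candidates = [
    Message("1@example", "Reanalysis of the benchmark data"),
    Message("2@example", "Weekly status"),
    Message("3@example", "Static analysis results"),
]
matches = list(filter_by_subject(candidates, r"analysis"))
```

The cost is a linear scan over the candidate set, so the narrower the
initial query, the cheaper the regexp pass.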