multilingual notmuch (and Content-Language)

David Bremner david at tethera.net
Sun Mar 18 11:22:47 PDT 2018


Daniel Kahn Gillmor <dkg at fifthhorseman.net> writes:

> AIUI, xapian is pretty much committed to being a single-language
> indexer.  But i just wanted to point out that it's possible that we
> could be smarter about this in notmuch, and wanted to make a space for
> possible design discussion.
>

More precisely, it uses a single _stemmer_ when generating terms and
when parsing queries. Nothing says that this stemmer has to correspond
to a single human language. The stemmer is also configured at runtime,
so it could in principle be configurable per database. I mention the
possibility of a custom stemmer because that also seems like a natural
place to put things like Unicode normalization and accent removal.
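
To make the normalization idea concrete, here is a rough sketch of the
kind of term folding such a custom stemmer might do before (or instead
of) language-specific stemming. It is plain Python using only the
standard library, not Xapian's stemmer interface; the name fold_term is
hypothetical, and the choice of NFKD decomposition plus case folding is
an assumption about what "normalization and accent removal" would mean
here.

```python
import unicodedata


def fold_term(term: str) -> str:
    """Return a case- and accent-insensitive form of a term (hypothetical helper)."""
    # Decompose characters so accents become separate combining marks.
    decomposed = unicodedata.normalize("NFKD", term)
    # Drop the combining marks, keeping only the base characters.
    stripped = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    # Recompose whatever is left and fold case for matching.
    return unicodedata.normalize("NFKC", stripped).casefold()


if __name__ == "__main__":
    for word in ["Café", "cafe\u0301", "naïve"]:
        print(word, "->", fold_term(word))
```

With this, precomposed and decomposed spellings of the same word end up
as the same term, so the same folding would have to be applied both at
index time and when parsing queries.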

d



