[PATCH 3/4] Optimize thread search using matched docid sets.

Austin Clements amdragon at MIT.EDU
Wed Nov 17 23:38:29 PST 2010


Currently this code uses a bitmap indexed by docid as a simple, fast
set structure.  This is quite memory-efficient if the docid space is
dense, even if the largest docid is quite large.  Is there a danger
that the docid space will be large and sparse?  Is it worth replacing
this with a smarter bit set structure?

Quoth myself on Nov 17 at  2:28 pm:
> This reduces thread search's 1+2t Xapian queries (where t is the
> number of matched threads) to 1+t queries and constructs exactly one
> notmuch_message_t for each message instead of 2 to 3.
> notmuch_query_search_threads eagerly fetches the docids of all
> messages matching the user query instead of lazily constructing
> message objects and fetching thread ID's from term lists.
> _notmuch_thread_create takes a seed docid and the set of all matched
> docids and uses a single Xapian query to expand this docid to its
> containing thread, using the matched docid set to determine which
> messages in the thread match the user query instead of using a second
> Xapian query.
> 
> As a side effect, this fixes author order so authors are always sorted
> by first occurrence in each thread.  This breaks two emacs tests that
> hard-code the old, buggy author order.
> 
> This reduces the amount of time required to load my inbox from 4.523
> seconds to 3.025 seconds (1.5X faster).


More information about the notmuch mailing list