[RFC PATCH 2/4] Add NOTMUCH_MESSAGE_FLAG_EXCLUDED flag

Austin Clements amdragon at MIT.EDU
Mon Jan 23 18:45:21 PST 2012


The overall structure of this series looks great.  There's obviously a
lot of clean up to do, but I'll reply with a few high-level comments.

Quoth Mark Walters on Jan 24 at  1:18 am:
> Form excluded doc_ids set and use that to exclude messages.
> Should be no functional change.
> 
> ---
>  lib/notmuch-private.h |    1 +
>  lib/query.cc          |   28 ++++++++++++++++++++++++++--
>  2 files changed, 27 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/notmuch-private.h b/lib/notmuch-private.h
> index 7bf153e..e791bb0 100644
> --- a/lib/notmuch-private.h
> +++ b/lib/notmuch-private.h
> @@ -401,6 +401,7 @@ typedef struct _notmuch_message_list {
>   */
>  struct visible _notmuch_messages {
>      notmuch_bool_t is_of_list_type;
> +    notmuch_doc_id_set_t *excluded_doc_ids;
>      notmuch_message_node_t *iterator;
>  };
>  
> diff --git a/lib/query.cc b/lib/query.cc
> index c25b301..92fa834 100644
> --- a/lib/query.cc
> +++ b/lib/query.cc
> @@ -57,6 +57,11 @@ struct visible _notmuch_threads {
>      notmuch_doc_id_set_t match_set;
>  };
>  
> +static notmuch_bool_t
> +_notmuch_doc_id_set_init (void *ctx,
> +			  notmuch_doc_id_set_t *doc_ids,
> +			  GArray *arr);
> +
>  notmuch_query_t *
>  notmuch_query_create (notmuch_database_t *notmuch,
>  		      const char *query_string)
> @@ -173,6 +178,7 @@ notmuch_query_search_messages (notmuch_query_t *query)
>  						   "mail"));
>  	Xapian::Query string_query, final_query, exclude_query;
>  	Xapian::MSet mset;
> +	Xapian::MSetIterator iterator;
>  	unsigned int flags = (Xapian::QueryParser::FLAG_BOOLEAN |
>  			      Xapian::QueryParser::FLAG_PHRASE |
>  			      Xapian::QueryParser::FLAG_LOVEHATE |
> @@ -193,8 +199,21 @@ notmuch_query_search_messages (notmuch_query_t *query)
>  
>  	exclude_query = _notmuch_exclude_tags (query, final_query);
>  
> -	final_query = Xapian::Query (Xapian::Query::OP_AND_NOT,
> -					 final_query, exclude_query);
> +	enquire.set_weighting_scheme (Xapian::BoolWeight());
> +	enquire.set_query (exclude_query);
> +
> +	mset = enquire.get_mset (0, notmuch->xapian_db->get_doccount ());
> +
> +	GArray *excluded_doc_ids = g_array_new (FALSE, FALSE, sizeof (unsigned int));
> +
> +	for (iterator = mset.begin (); iterator != mset.end (); iterator++)
> +	{
> +	    unsigned int doc_id = *iterator;
> +	    g_array_append_val (excluded_doc_ids, doc_id);
> +	}
> +	messages->base.excluded_doc_ids = talloc (query, _notmuch_doc_id_set);
> +	_notmuch_doc_id_set_init (query, messages->base.excluded_doc_ids,
> +				  excluded_doc_ids);

This might be inefficient for message-only queries, since it will
fetch *all* excluded docids.  This highlights a basic difference
between message and thread search: thread search can return messages
that don't match the original query and hence needs to know all
potentially excluded messages, while message search can only return
messages that match the original query.

It's entirely possible this doesn't matter because Xapian probably
still needs to fetch the full posting lists of the excluded terms, but
it would be worth doing a quick/hacky benchmark to verify this, with
enough excluded messages to make the cost non-trivial.

If it does matter, you could pass in a flag indicating if the exclude
query should be limited by the original query or not.  Or you could do
the limited exclude query in notmuch_query_search_messages and a
separate open-ended exclude query in notmuch_query_search_threads.

>  
>  	enquire.set_weighting_scheme (Xapian::BoolWeight());
>  
> @@ -294,6 +313,11 @@ _notmuch_mset_messages_move_to_next (notmuch_messages_t *messages)
>      mset_messages = (notmuch_mset_messages_t *) messages;
>  
>      mset_messages->iterator++;
> +
> +    while ((mset_messages->iterator != mset_messages->iterator_end) &&
> +	   (_notmuch_doc_id_set_contains (messages->excluded_doc_ids,
> +					  *mset_messages->iterator)))
> +	mset_messages->iterator++;

This seemed a little weird, since you remove it in the next patch.  Is
this just to keep the tests happy?  (If so, it would be worth
mentioning in the commit message; other reviewers will definitely have
the same question.)

>  }
>  
>  static notmuch_bool_t


More information about the notmuch mailing list