[PATCH v2 07/12] lib: Internal support for querying and creating ghost messages

Austin Clements aclements at csail.mit.edu
Tue Oct 21 18:33:00 PDT 2014


Quoth Mark Walters on Oct 22 at 12:05 am:
> 
> Hi 
> 
> I am slowly working my way through this series: only two trivial queries
> so far.
> 
> On Tue, 07 Oct 2014, Austin Clements <aclements at csail.mit.edu> wrote:
> > From: Austin Clements <amdragon at mit.edu>
> >
> > This updates the message abstraction to support ghost messages: it
> > adds a message flag that distinguishes regular messages from ghost
> > messages, and an internal function for initializing a newly created
> > (blank) message as a ghost message.
> > ---
> >  lib/message.cc        | 52 +++++++++++++++++++++++++++++++++++++++++++++++++--
> >  lib/notmuch-private.h |  4 ++++
> >  lib/notmuch.h         |  9 ++++++++-
> >  3 files changed, 62 insertions(+), 3 deletions(-)
> >
> > diff --git a/lib/message.cc b/lib/message.cc
> > index 55d2ff6..a7a13cc 100644
> > --- a/lib/message.cc
> > +++ b/lib/message.cc
> > @@ -39,6 +39,9 @@ struct visible _notmuch_message {
> >      notmuch_message_file_t *message_file;
> >      notmuch_message_list_t *replies;
> >      unsigned long flags;
> > +    /* For flags that are initialized on-demand, lazy_flags indicates
> > +     * if each flag has been initialized. */
> > +    unsigned long lazy_flags;
> 
> I wonder if valid_flags might be a better name here since, as far as I
> can see, these bits exist so that we can invalidate a flag rather than
> as an optimisation (which is what I thought "lazy" implied).

I do think of this as an optimization.  If we were to compute the
value of this flag when a message was created (and keep it
up-to-date), there would be no need for lazy_flags.  But, unlike the
other flags, computing this is expensive.
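
To make the pattern concrete, here is a minimal standalone sketch of a
lazily computed flag guarded by a second bitmask.  Everything named
demo_* is hypothetical and only stands in for the real structures; the
"stored_as_ghost" field is an assumed substitute for the expensive
term-list lookup, and "lookups" exists purely to show how often that
lookup runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the flag enum and the message struct. */
enum demo_flag { DEMO_FLAG_MATCH, DEMO_FLAG_EXCLUDED, DEMO_FLAG_GHOST };

struct demo_message {
    unsigned long flags;      /* flag values */
    unsigned long lazy_flags; /* bit set => that bit of 'flags' is initialized */
    bool stored_as_ghost;     /* stands in for the "type" term in the database */
    int lookups;              /* number of (pretend) expensive lookups so far */
};

#define DEMO_BIT(flag) (1ul << (flag))

/* Stand-in for the expensive metadata walk: compute the ghost flag
 * once and record that it is now initialized. */
static void
demo_ensure_metadata (struct demo_message *m)
{
    assert (m != NULL);
    if (m->lazy_flags & DEMO_BIT (DEMO_FLAG_GHOST))
        return;
    m->lookups++;
    if (m->stored_as_ghost)
        m->flags |= DEMO_BIT (DEMO_FLAG_GHOST);
    else
        m->flags &= ~DEMO_BIT (DEMO_FLAG_GHOST);
    m->lazy_flags |= DEMO_BIT (DEMO_FLAG_GHOST);
}

/* Only the ghost flag is lazy; the other flags are read directly. */
static bool
demo_get_flag (struct demo_message *m, enum demo_flag flag)
{
    if (flag == DEMO_FLAG_GHOST && !(m->lazy_flags & DEMO_BIT (flag)))
        demo_ensure_metadata (m);
    return (m->flags & DEMO_BIT (flag)) != 0;
}

/* Called when the "type" term changes: forget the cached value so the
 * next query recomputes it. */
static void
demo_invalidate_type (struct demo_message *m)
{
    m->flags &= ~DEMO_BIT (DEMO_FLAG_GHOST);
    m->lazy_flags &= ~DEMO_BIT (DEMO_FLAG_GHOST);
}
```

In this sketch, repeated demo_get_flag calls for the ghost flag pay for
the lookup only once, until demo_invalidate_type clears the
corresponding lazy_flags bit.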

> >  
> >      Xapian::Document doc;
> >      Xapian::termcount termpos;
> > @@ -99,6 +102,7 @@ _notmuch_message_create_for_document (const void *talloc_owner,
> >  
> >      message->frozen = 0;
> >      message->flags = 0;
> > +    message->lazy_flags = 0;
> >  
> >      /* Each of these will be lazily created as needed. */
> >      message->message_id = NULL;
> > @@ -192,7 +196,7 @@ _notmuch_message_create (const void *talloc_owner,
> >   *
> >   *     There is already a document with message ID 'message_id' in the
> >   *     database. The returned message can be used to query/modify the
> > - *     document.
> > + *     document. The message may be a ghost message.
> >   *
> >   *   NOTMUCH_PRIVATE_STATUS_NO_DOCUMENT_FOUND:
> >   *
> > @@ -305,6 +309,7 @@ _notmuch_message_ensure_metadata (notmuch_message_t *message)
> >      const char *thread_prefix = _find_prefix ("thread"),
> >  	*tag_prefix = _find_prefix ("tag"),
> >  	*id_prefix = _find_prefix ("id"),
> > +	*type_prefix = _find_prefix ("type"),
> >  	*filename_prefix = _find_prefix ("file-direntry"),
> >  	*replyto_prefix = _find_prefix ("replyto");
> >  
> > @@ -337,10 +342,25 @@ _notmuch_message_ensure_metadata (notmuch_message_t *message)
> >  	message->message_id =
> >  	    _notmuch_message_get_term (message, i, end, id_prefix);
> >  
> > +    /* Get document type */
> > +    assert (strcmp (id_prefix, type_prefix) < 0);
> > +    if (! NOTMUCH_TEST_BIT (message->lazy_flags, NOTMUCH_MESSAGE_FLAG_GHOST)) {
> > +	i.skip_to (type_prefix);
> > +	/* "T" is the prefix for "type" fields.  See
> > +	 * BOOLEAN_PREFIX_INTERNAL. */
> > +	if (*i == "Tmail")
> > +	    NOTMUCH_CLEAR_BIT (&message->flags, NOTMUCH_MESSAGE_FLAG_GHOST);
> > +	else if (*i == "Tghost")
> > +	    NOTMUCH_SET_BIT (&message->flags, NOTMUCH_MESSAGE_FLAG_GHOST);
> > +	else
> > +	    INTERNAL_ERROR ("Message without type term");
> > +	NOTMUCH_SET_BIT (&message->lazy_flags, NOTMUCH_MESSAGE_FLAG_GHOST);
> > +    }
> > +
> >      /* Get filename list.  Here we get only the terms.  We lazily
> >       * expand them to full file names when needed in
> >       * _notmuch_message_ensure_filename_list. */
> > -    assert (strcmp (id_prefix, filename_prefix) < 0);
> > +    assert (strcmp (type_prefix, filename_prefix) < 0);
> >      if (!message->filename_term_list && !message->filename_list)
> >  	message->filename_term_list =
> >  	    _notmuch_database_get_terms_with_prefix (message, i, end,
> > @@ -371,6 +391,11 @@ _notmuch_message_invalidate_metadata (notmuch_message_t *message,
> >  	message->tag_list = NULL;
> >      }
> >  
> > +    if (strcmp ("type", prefix_name) == 0) {
> > +	NOTMUCH_CLEAR_BIT (&message->flags, NOTMUCH_MESSAGE_FLAG_GHOST);
> > +	NOTMUCH_CLEAR_BIT (&message->lazy_flags, NOTMUCH_MESSAGE_FLAG_GHOST);
> > +    }
> > +
> >      if (strcmp ("file-direntry", prefix_name) == 0) {
> >  	talloc_free (message->filename_term_list);
> >  	talloc_free (message->filename_list);
> > @@ -869,6 +894,10 @@ notmuch_bool_t
> >  notmuch_message_get_flag (notmuch_message_t *message,
> >  			  notmuch_message_flag_t flag)
> >  {
> > +    if (flag == NOTMUCH_MESSAGE_FLAG_GHOST &&
> > +	! NOTMUCH_TEST_BIT (message->lazy_flags, flag))
> > +	_notmuch_message_ensure_metadata (message);
> > +
> >      return NOTMUCH_TEST_BIT (message->flags, flag);
> >  }
> >  
> > @@ -880,6 +909,7 @@ notmuch_message_set_flag (notmuch_message_t *message,
> >  	NOTMUCH_SET_BIT (&message->flags, flag);
> >      else
> >  	NOTMUCH_CLEAR_BIT (&message->flags, flag);
> > +    NOTMUCH_SET_BIT (&message->lazy_flags, flag);
> >  }
> >  
> >  time_t
> > @@ -989,6 +1019,24 @@ _notmuch_message_delete (notmuch_message_t *message)
> >      return NOTMUCH_STATUS_SUCCESS;
> >  }
> >  
> > +/* Transform a blank message into a ghost message.  The caller must
> > + * _notmuch_message_sync the message. */
> > +notmuch_private_status_t
> > +_notmuch_message_initialize_ghost (notmuch_message_t *message,
> > +				   const char *thread_id)
> > +{
> > +    notmuch_private_status_t status;
> > +
> > +    status = _notmuch_message_add_term (message, "type", "ghost");
> > +    if (status)
> > +	return status;
> > +    status = _notmuch_message_add_term (message, "thread", thread_id);
> > +    if (status)
> > +	return status;
> > +
> > +    return NOTMUCH_PRIVATE_STATUS_SUCCESS;
> > +}
> > +
> >  /* Ensure that 'message' is not holding any file object open. Future
> >   * calls to various functions will still automatically open the
> >   * message file as needed.
> > diff --git a/lib/notmuch-private.h b/lib/notmuch-private.h
> > index 7250291..2f43c1d 100644
> > --- a/lib/notmuch-private.h
> > +++ b/lib/notmuch-private.h
> > @@ -308,6 +308,10 @@ _notmuch_message_sync (notmuch_message_t *message);
> >  notmuch_status_t
> >  _notmuch_message_delete (notmuch_message_t *message);
> >  
> > +notmuch_private_status_t
> > +_notmuch_message_initialize_ghost (notmuch_message_t *message,
> > +				   const char *thread_id);
> > +
> >  void
> >  _notmuch_message_close (notmuch_message_t *message);
> >  
> > diff --git a/lib/notmuch.h b/lib/notmuch.h
> > index dae0416..92594b9 100644
> > --- a/lib/notmuch.h
> > +++ b/lib/notmuch.h
> > @@ -1221,7 +1221,14 @@ notmuch_message_get_filenames (notmuch_message_t *message);
> >   */
> >  typedef enum _notmuch_message_flag {
> >      NOTMUCH_MESSAGE_FLAG_MATCH,
> > -    NOTMUCH_MESSAGE_FLAG_EXCLUDED
> > +    NOTMUCH_MESSAGE_FLAG_EXCLUDED,
> > +
> > +    /* This message is a "ghost message", meaning it has no filenames
> > +     * or content, but we know it exists because it was referenced by
> > +     * some other message.  A ghost message has only a message ID and
> > +     * thread ID.
> > +     */
> 
> Can I check here: we are not allowing a ghost message to have any tags?

Correct, at least for now.

However, I think it would make *a lot* of sense to be able to pre-seed
ghost messages with tags.  nmbug could use this to avoid races with
receiving messages.  Distributed tag sync could use it similarly.
Insert could use it to eliminate the nasty races between storing the
message, indexing it, and tagging it.  Restore could potentially use
it.  When sending messages, we could pre-seed a sent tag for when the
sent message is re-received (though insert may obviate that).  I'm
sure there are other uses I haven't thought of.

This requires some new APIs, since there's currently no way for a
library user to create a ghost message or get at it to tag it.  It
also slightly complicates notmuch_database_get_all_tags, since that
probably shouldn't return tags that are only on ghost messages.  (I
think that if we just collect all the docids in the Tghost posting
list and use them to filter out tag terms, there should be almost no
performance impact.)  But these are both quite doable things.
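
As a rough illustration of that filtering idea (not a proposed
implementation), the sketch below assumes each posting list is already
available as an array of docids sorted in increasing order.  The
function name and the array representation are hypothetical; the real
code would walk Xapian posting lists instead.

```c
#include <stdbool.h>
#include <stddef.h>

/* Return true if the sorted docid array 'tag_docs' contains at least
 * one docid that does not appear in the sorted array 'ghost_docs',
 * i.e. the tag is on at least one non-ghost message.  Both arrays
 * must be in increasing docid order. */
static bool
demo_tag_on_non_ghost (const unsigned *tag_docs, size_t n_tag,
                       const unsigned *ghost_docs, size_t n_ghost)
{
    size_t g = 0;

    for (size_t t = 0; t < n_tag; t++) {
        /* Advance through the ghost list until it catches up with
         * the current tagged docid. */
        while (g < n_ghost && ghost_docs[g] < tag_docs[t])
            g++;
        /* No matching ghost docid: this tagged message is real. */
        if (g == n_ghost || ghost_docs[g] != tag_docs[t])
            return true;
    }
    return false;
}
```

A get_all_tags-style loop would then report a tag only when this
returns true for its posting list.  Each check is a single merge pass
over the two lists and stops at the first non-ghost docid.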

A more complicated question is what we want to do with deleted
messages.  Currently we remove them entirely from the database, but we
*could* keep around their tags so if the message reappears (e.g.,
there was a transient problem) we can bring back the tags.  After
thinking about this a great deal, I concluded we should just continue
deleting them from the database (or, at most, strip the message back
down to its thread ID).  If anyone's curious, I can write up my
thoughts on this, but it boils down to this complicating the semantics
of initial tagging and dump/restore.

> Best wishes
> 
> Mark
> 
> > +    NOTMUCH_MESSAGE_FLAG_GHOST,
> >  } notmuch_message_flag_t;
> >  
> >  /**
> >

