[PATCH 08/14] lib: Simplify upgrade code using a transaction

Austin Clements amdragon at MIT.EDU
Sun Jul 27 09:42:24 PDT 2014


Quoth Mark Walters on Jul 27 at 10:35 am:
> 
> Hi
> 
> On Sun, 27 Jul 2014, Austin Clements <amdragon at MIT.EDU> wrote:
> > Previously, the upgrade was organized as two passes -- an upgrade
> > pass, and a separate cleanup pass -- so the database was always in a
> > valid state.  This change substantially simplifies this code by
> > performing the upgrade in a transaction and combining both passes
> > into one.  This 1) eliminates a lot of duplicate code between the
> > passes, 2) speeds up the upgrade process, 3) makes progress reporting
> > more accurate, 4) eliminates the potential for stale data if the
> > upgrade is interrupted during the cleanup pass, and 5) makes it easier
> > to reason about the safety of the upgrade code.
> 
> I like this but I wonder if it has a side effect: I think with the
> current code the user can interrupt the upgrade (ctrl-c) and continue
> roughly where it left off. This looks like it means the whole upgrade
> needs to be done in one go. Will this be a problem on large mail stores
> (eg rlb with over 1M messages)?
> 
> I am not sure what could be done during the interrupted upgrade before
> so maybe this is not a problem.

I haven't tested this hypothesis, but I don't think a partially
completed upgrade would actually help upon restarting the upgrade.
Since the old upgrade process couldn't safely remove terms/data until
the end of the upgrade, if it were interrupted, the next upgrade would
start right back at the beginning and do everything over again.

Also, since the old upgrade code had to update the version number
before removing old terms/data, if it was interrupted during the
cleanup process the database would be left with cruft that would
*never* be removed.
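
To make the ordering problem concrete, here is a toy model of the two
schemes.  It is only a sketch: ToyDb, the "old-"/"new-format" terms and
the `budget` parameter (which simulates an interrupt by limiting how
many old terms get removed) are all made up for illustration, and none
of it is notmuch or Xapian code.  The "transaction" is just a pending
copy that replaces the real state only when the upgrade finishes.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

// Toy stand-in for the database: a version number plus a set of terms.
// Hypothetical model only; none of these names are notmuch or Xapian API.
struct ToyDb {
    int version = 0;
    std::set<std::string> terms = {"old-a", "old-b", "old-c"};
};

// Remove at most `budget` terms that start with "old-".  A small budget
// simulates the process being interrupted part-way through cleanup.
static void remove_old_terms (std::set<std::string> &terms, std::size_t budget)
{
    std::size_t removed = 0;
    for (auto it = terms.begin (); it != terms.end () && removed < budget;) {
        if (it->compare (0, 4, "old-") == 0) {
            it = terms.erase (it);
            removed++;
        } else {
            ++it;
        }
    }
}

static bool has_old_terms (const std::set<std::string> &terms)
{
    for (const std::string &t : terms)
        if (t.compare (0, 4, "old-") == 0)
            return true;
    return false;
}

// Old ordering: write new data, record the new version, then clean up.
void two_pass_upgrade (ToyDb &db, std::size_t budget)
{
    if (db.version >= 1)
        return;                 // looks upgraded, so cleanup never reruns
    db.terms.insert ("new-format");
    db.version = 1;             // version is durable before cleanup starts
    remove_old_terms (db.terms, budget);
}

// Transactional ordering: all changes go to a pending copy that only
// replaces the real state if the whole upgrade finished.
bool transactional_upgrade (ToyDb &db, std::size_t budget)
{
    if (db.version >= 1)
        return true;
    ToyDb pending = db;         // "begin transaction"
    pending.terms.insert ("new-format");
    remove_old_terms (pending.terms, budget);
    if (has_old_terms (pending.terms))
        return false;           // interrupted: pending changes are discarded
    pending.version = 1;
    db = pending;               // "commit"
    assert (! has_old_terms (db.terms));
    return true;
}
```

Calling two_pass_upgrade with a budget of 1 and then again with a large
budget leaves two "old-" terms behind for good, because the second call
sees version 1 and returns immediately.  The transactional variant
either leaves the toy database untouched or fully upgraded.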

With features we actually have a better chance of making partially
completed upgrades useful: we could commit after each individual
feature gets upgraded.  Of course, that only helps when an upgrade has
multiple new features to upgrade to, so it may or may not be useful in
practice depending on how quickly we add new features.
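
A rough sketch of what committing per feature could look like, again as
a toy model rather than real notmuch code.  ToyDb, FeatureStep and
upgrade_by_feature are hypothetical names, the feature bits are
arbitrary, and the "transaction" is modelled as a pending copy; a step
returning false stands in for the user hitting ctrl-c.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Toy database state: a feature bitmask plus some payload that the
// upgrade steps modify.  Hypothetical; not notmuch or Xapian API.
struct ToyDb {
    unsigned features = 0;
    std::vector<std::string> notes;
};

// One upgrade step per feature.  `apply` returns false if it was
// interrupted before finishing.
struct FeatureStep {
    unsigned bit;
    std::function<bool (ToyDb &)> apply;
};

// Upgrade one feature at a time, committing after each.  Returns the
// feature bitmask that is durably recorded when the function stops.
unsigned upgrade_by_feature (ToyDb &db, const std::vector<FeatureStep> &steps)
{
    for (const FeatureStep &step : steps) {
        if (db.features & step.bit)
            continue;           // committed by an earlier run, skip it
        ToyDb pending = db;     // "begin transaction"
        if (! step.apply (pending))
            return db.features; // interrupted: drop this step's changes
        pending.features |= step.bit;
        db = pending;           // "commit"
    }
    return db.features;
}
```

If the second of two steps is interrupted, a later call skips the first
step and redoes only the second, which is the benefit described above.
With a single new feature this degenerates to one big transaction.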

> Best wishes
> 
> Mark
> 
> 
> > ---
> >  lib/database.cc | 67 ++++++---------------------------------------------------
> >  1 file changed, 7 insertions(+), 60 deletions(-)
> >
> > diff --git a/lib/database.cc b/lib/database.cc
> > index 03eef3e..0be7180 100644
> > --- a/lib/database.cc
> > +++ b/lib/database.cc
> > @@ -1238,6 +1238,9 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
> >  	timer_is_active = TRUE;
> >      }
> >  
> > +    /* Perform the upgrade in a transaction. */
> > +    db->begin_transaction (true);
> > +
> >      /* Before version 1, each message document had its filename in the
> >       * data field. Copy that into the new format by calling
> >       * notmuch_message_add_filename.
> > @@ -1265,6 +1268,7 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
> >  	    filename = _notmuch_message_talloc_copy_data (message);
> >  	    if (filename && *filename != '\0') {
> >  		_notmuch_message_add_filename (message, filename);
> > +		_notmuch_message_clear_data (message);
> >  		_notmuch_message_sync (message);
> >  	    }
> >  	    talloc_free (filename);
> > @@ -1312,6 +1316,8 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
> >  						       NOTMUCH_FIND_CREATE, &status);
> >  		notmuch_directory_set_mtime (directory, mtime);
> >  		notmuch_directory_destroy (directory);
> > +
> > +		db->delete_document (*p);
> >  	    }
> >  	}
> >      }
> > @@ -1353,67 +1359,8 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
> >      notmuch->features |= NOTMUCH_FEATURES_CURRENT;
> >      db->set_metadata ("features", _print_features (local, notmuch->features));
> >      db->set_metadata ("version", STRINGIFY (NOTMUCH_DATABASE_VERSION));
> > -    db->flush ();
> > -
> > -    /* Now that the upgrade is complete we can remove the old data
> > -     * and documents that are no longer needed. */
> > -    if (version < 1) {
> > -	notmuch_query_t *query = notmuch_query_create (notmuch, "");
> > -	notmuch_messages_t *messages;
> > -	notmuch_message_t *message;
> > -	char *filename;
> > -
> > -	for (messages = notmuch_query_search_messages (query);
> > -	     notmuch_messages_valid (messages);
> > -	     notmuch_messages_move_to_next (messages))
> > -	{
> > -	    if (do_progress_notify) {
> > -		progress_notify (closure, (double) count / total);
> > -		do_progress_notify = 0;
> > -	    }
> > -
> > -	    message = notmuch_messages_get (messages);
> > -
> > -	    filename = _notmuch_message_talloc_copy_data (message);
> > -	    if (filename && *filename != '\0') {
> > -		_notmuch_message_clear_data (message);
> > -		_notmuch_message_sync (message);
> > -	    }
> > -	    talloc_free (filename);
> > -
> > -	    notmuch_message_destroy (message);
> > -	}
> >  
> > -	notmuch_query_destroy (query);
> > -    }
> > -
> > -    if (version < 1) {
> > -	Xapian::TermIterator t, t_end;
> > -
> > -	t_end = notmuch->xapian_db->allterms_end ("XTIMESTAMP");
> > -
> > -	for (t = notmuch->xapian_db->allterms_begin ("XTIMESTAMP");
> > -	     t != t_end;
> > -	     t++)
> > -	{
> > -	    Xapian::PostingIterator p, p_end;
> > -	    std::string term = *t;
> > -
> > -	    p_end = notmuch->xapian_db->postlist_end (term);
> > -
> > -	    for (p = notmuch->xapian_db->postlist_begin (term);
> > -		 p != p_end;
> > -		 p++)
> > -	    {
> > -		if (do_progress_notify) {
> > -		    progress_notify (closure, (double) count / total);
> > -		    do_progress_notify = 0;
> > -		}
> > -
> > -		db->delete_document (*p);
> > -	    }
> > -	}
> > -    }
> > +    db->commit_transaction ();
> >  
> >      if (timer_is_active) {
> >  	/* Now stop the timer. */
> >

