[PATCH 05/14] lib: Database version 3: Introduce fine-grained "features"
Mark Walters
markwalters1009 at gmail.com
Sun Jul 27 02:53:25 PDT 2014
On Sun, 27 Jul 2014, Austin Clements <amdragon at MIT.EDU> wrote:
> Previously, our database schema was versioned by a single number.
> Each database schema change had to occur "atomically" in Notmuch's
> development history: before some commit, Notmuch used version N, after
> that commit, it used version N+1. Hence, each new schema version
> could introduce only one change, the task of developing a schema
> change fell on a single person, and it all had to happen and be
> perfect in a single commit series. This made introducing a new schema
> version hard. We've seen only two schema changes in the history of
> Notmuch.
I like this series as a whole and I am happy with all the patches that I
haven't sent comments on. (Note, however, that I am not very familiar
with the database code.)
There is one thing which confuses me in this patch:
>
> This commit introduces database schema version 3; hopefully the last
> schema version we'll need for a while. With this version, we switch
> from a single version number to "features": a set of named,
> independent aspects of the database schema.
>
> Features should make backwards compatibility easier. For many things,
> it should be easy to support databases both with and without a
> feature, which will allow us to make upgrades optional and will enable
> "unstable" features that can be developed and tested over time.
>
> Features also make forwards compatibility easier. The features
> recorded in a database include "compatibility flags," which can
> indicate to an older version of Notmuch when it must support a given
> feature to open the database for read or for write. This lets us
> replace the old vague "I don't recognize this version, so something
> might go wrong, but I promise to try my best" warnings upon opening a
> database with an unknown version with precise errors. If a database
> is safe to open for read/write despite unknown features, an older
> version will know that and issue no message at all. If the database
> is not safe to open for read/write because of unknown features, an
> older version will know that, too, and can tell the user exactly which
> required features it lacks support for.
> ---
> lib/database-private.h | 56 +++++++++++++++
> lib/database.cc | 189 +++++++++++++++++++++++++++++++++++++++++--------
> 2 files changed, 214 insertions(+), 31 deletions(-)
>
> diff --git a/lib/database-private.h b/lib/database-private.h
> index d3e65fd..323b9fe 100644
> --- a/lib/database-private.h
> +++ b/lib/database-private.h
> @@ -46,6 +46,11 @@ struct _notmuch_database {
> int atomic_nesting;
> Xapian::Database *xapian_db;
>
> + /* Bit mask of features used by this database. Features are
> + * named, independent aspects of the database schema. This is a
> + * bitwise-OR of NOTMUCH_FEATURE_* values (below). */
> + unsigned int features;
> +
> unsigned int last_doc_id;
> uint64_t last_thread_id;
>
> @@ -55,6 +60,57 @@ struct _notmuch_database {
> Xapian::ValueRangeProcessor *date_range_processor;
> };
>
> +/* Bit masks for _notmuch_database::features. */
> +enum {
> + /* If set, file names are stored in "file-direntry" terms. If
> + * unset, file names are stored in document data.
> + *
> + * Introduced: version 1. Implementation support: both for read;
> + * required for write. */
> + NOTMUCH_FEATURE_FILE_TERMS = 1 << 0,
> +
> + /* If set, directory timestamps are stored in documents with
> + * XDIRECTORY terms and relative paths. If unset, directory
> + * timestamps are stored in documents with XTIMESTAMP terms and
> + * absolute paths.
> + *
> + * Introduced: version 1. Implementation support: required. */
> + NOTMUCH_FEATURE_DIRECTORY_DOCS = 1 << 1,
> +
> + /* If set, the from, subject, and message-id headers are stored in
> + * message document values. If unset, message documents *may*
> + * have these values, but if the value is empty, it must be
> + * retrieved from the message file.
> + *
> + * Introduced: optional in version 1, required as of version 3.
> + * Implementation support: both.
> + */
> + NOTMUCH_FEATURE_FROM_SUBJECT_ID_VALUES = 1 << 2,
> +
> + /* If set, folder terms are boolean and path terms exist. If
> + * unset, folder terms are probabilistic and stemmed and path
> + * terms do not exist.
> + *
> + * Introduced: version 2. Implementation support: required. */
> + NOTMUCH_FEATURE_BOOL_FOLDER = 1 << 3,
> +};
> +
> +/* Prior to database version 3, features were implied by the database
> + * version number, so hard-code them for earlier versions. */
> +#define NOTMUCH_FEATURES_V0 (0)
> +#define NOTMUCH_FEATURES_V1 (NOTMUCH_FEATURES_V0 | NOTMUCH_FEATURE_FILE_TERMS | \
> + NOTMUCH_FEATURE_DIRECTORY_DOCS)
> +#define NOTMUCH_FEATURES_V2 (NOTMUCH_FEATURES_V1 | NOTMUCH_FEATURE_BOOL_FOLDER)
> +
> +/* Current database features. If any of these are missing from a
> + * database, request an upgrade.
> + * NOTMUCH_FEATURE_FROM_SUBJECT_ID_VALUES is not included because
> + * upgrade doesn't currently introduce the feature (though brand new
> + * databases will have it). */
> +#define NOTMUCH_FEATURES_CURRENT \
> + (NOTMUCH_FEATURE_FILE_TERMS | NOTMUCH_FEATURE_DIRECTORY_DOCS | \
> + NOTMUCH_FEATURE_BOOL_FOLDER)
> +
> /* Return the list of terms from the given iterator matching a prefix.
> * The prefix will be stripped from the strings in the returned list.
> * The list will be allocated using ctx as the talloc context.
> diff --git a/lib/database.cc b/lib/database.cc
> index 45c4260..03eef3e 100644
> --- a/lib/database.cc
> +++ b/lib/database.cc
> @@ -20,6 +20,7 @@
>
> #include "database-private.h"
> #include "parse-time-vrp.h"
> +#include "string-util.h"
>
> #include <iostream>
>
> @@ -42,7 +43,7 @@ typedef struct {
> const char *prefix;
> } prefix_t;
>
> -#define NOTMUCH_DATABASE_VERSION 2
> +#define NOTMUCH_DATABASE_VERSION 3
>
> #define STRINGIFY(s) _SUB_STRINGIFY(s)
> #define _SUB_STRINGIFY(s) #s
> @@ -151,6 +152,17 @@ typedef struct {
> * changes are made to the database (such as by
> * indexing new fields).
> *
> + * features The set of features supported by this
> + * database. This consists of a set of
> + * '\n'-separated lines, where each is a feature
> + * name, a '\t', and compatibility flags. If the
> + * compatibility flags contain 'w', then the
> + * opener must support this feature to safely
> + * write this database. If the compatibility
> + * flags contain 'r', then the opener must
> + * support this feature to read this database.
> + * Introduced in database version 3.
> + *
> * last_thread_id The last thread ID generated. This is stored
> * as a 16-byte hexadecimal ASCII representation
> * of a 64-bit unsigned integer. The first ID
> @@ -226,6 +238,7 @@ static prefix_t PROBABILISTIC_PREFIX[]= {
> { "subject", "XSUBJECT"},
> };
>
> +
> const char *
> _find_prefix (const char *name)
> {
> @@ -251,6 +264,25 @@ _find_prefix (const char *name)
> return "";
> }
>
> +static const struct
> +{
> + /* NOTMUCH_FEATURE_* value. */
> + unsigned int value;
> + /* Feature name as it appears in the database. This name should
> + * be appropriate for displaying to the user if an older version
> + * of notmuch doesn't support this feature. */
> + const char *name;
> + /* Compatibility flags when this feature is declared. */
> + const char *flags;
> +} feature_names[] = {
> + {NOTMUCH_FEATURE_FILE_TERMS, "file terms", "rw"},
> + {NOTMUCH_FEATURE_DIRECTORY_DOCS, "directory documents", "rw"},
> + /* Header values are not required for reading a database because a
> + * reader can just refer to the message file. */
> + {NOTMUCH_FEATURE_FROM_SUBJECT_ID_VALUES, "from/subject/ID values", "w"},
> + {NOTMUCH_FEATURE_BOOL_FOLDER, "boolean folder terms", "rw"},
> +};
> +
> const char *
> notmuch_status_to_string (notmuch_status_t status)
> {
> @@ -591,6 +623,11 @@ notmuch_database_create (const char *path, notmuch_database_t **database)
> ¬much);
> if (status)
> goto DONE;
> +
> + /* Upgrade doesn't add this feature to existing databases, but new
> + * databases have it. */
> + notmuch->features |= NOTMUCH_FEATURE_FROM_SUBJECT_ID_VALUES;
> +
Does this mean that if the user wants this feature he has to dump,
delete the existing database, and restore? Is that just because no-one
has implemented upgrade for this feature or is it "hard" in some sense?
A possibly related question: is there likely to be a case where the user
does not want to add/upgrade some feature? Would it be worth having an
explicit upgrade command which let the user choose which features to
upgrade? This is orthogonal to this series: I am just trying to get a
feel for how it will, or could, be used.
Best wishes
Mark
> status = notmuch_database_upgrade (notmuch, NULL, NULL);
> if (status) {
> notmuch_database_close(notmuch);
> @@ -619,6 +656,80 @@ _notmuch_database_ensure_writable (notmuch_database_t *notmuch)
> return NOTMUCH_STATUS_SUCCESS;
> }
>
> +/* Parse a database features string from the given database version.
> + *
> + * For version < 3, this ignores the features string and returns a
> + * hard-coded set of features.
> + *
> + * If there are unrecognized features that are required to open the
> + * database in mode (which should be 'r' or 'w'), return a
> + * comma-separated list of unrecognized but required features in
> + * *incompat_out, which will be allocated from ctx.
> + */
> +static unsigned int
> +_parse_features (const void *ctx, const char *features, unsigned int version,
> + char mode, char **incompat_out)
> +{
> + unsigned int res = 0, namelen, i;
> + size_t llen = 0;
> + const char *flags;
> +
> + /* Prior to database version 3, features were implied by the
> + * version number. */
> + if (version == 0)
> + return NOTMUCH_FEATURES_V0;
> + else if (version == 1)
> + return NOTMUCH_FEATURES_V1;
> + else if (version == 2)
> + return NOTMUCH_FEATURES_V2;
> +
> + /* Parse the features string */
> + while ((features = strtok_len_c (features + llen, "\n", &llen)) != NULL) {
> + flags = strchr (features, '\t');
> + if (! flags || flags > features + llen)
> + continue;
> + namelen = flags - features;
> +
> + for (i = 0; i < ARRAY_SIZE (feature_names); ++i) {
> + if (strlen (feature_names[i].name) == namelen &&
> + strncmp (feature_names[i].name, features, namelen) == 0) {
> + res |= feature_names[i].value;
> + break;
> + }
> + }
> +
> + if (i == ARRAY_SIZE (feature_names)) {
> + /* Unrecognized feature */
> + const char *have = strchr (flags, mode);
> + if (have && have < features + llen) {
> + /* This feature is required to access this database in
> + * 'mode', but we don't understand it. */
> + if (! *incompat_out)
> + *incompat_out = talloc_strdup (ctx, "");
> + *incompat_out = talloc_asprintf_append_buffer (
> + *incompat_out, "%s%.*s", **incompat_out ? ", " : "",
> + namelen, features);
> + }
> + }
> + }
> +
> + return res;
> +}
> +
> +static char *
> +_print_features (const void *ctx, unsigned int features)
> +{
> + unsigned int i;
> + char *res = talloc_strdup (ctx, "");
> +
> + for (i = 0; i < ARRAY_SIZE (feature_names); ++i)
> + if (features & feature_names[i].value)
> + res = talloc_asprintf_append_buffer (
> + res, "%s\t%s\n", feature_names[i].name, feature_names[i].flags);
> +
> + return res;
> +}
> +
> notmuch_status_t
> notmuch_database_open (const char *path,
> notmuch_database_mode_t mode,
> @@ -627,7 +738,7 @@ notmuch_database_open (const char *path,
> notmuch_status_t status = NOTMUCH_STATUS_SUCCESS;
> void *local = talloc_new (NULL);
> notmuch_database_t *notmuch = NULL;
> - char *notmuch_path, *xapian_path;
> + char *notmuch_path, *xapian_path, *incompat_features;
> struct stat st;
> int err;
> unsigned int i, version;
> @@ -686,39 +797,51 @@ notmuch_database_open (const char *path,
> if (mode == NOTMUCH_DATABASE_MODE_READ_WRITE) {
> notmuch->xapian_db = new Xapian::WritableDatabase (xapian_path,
> Xapian::DB_CREATE_OR_OPEN);
> - version = notmuch_database_get_version (notmuch);
> -
> - if (version > NOTMUCH_DATABASE_VERSION) {
> - fprintf (stderr,
> - "Error: Notmuch database at %s\n"
> - " has a newer database format version (%u) than supported by this\n"
> - " version of notmuch (%u). Refusing to open this database in\n"
> - " read-write mode.\n",
> - notmuch_path, version, NOTMUCH_DATABASE_VERSION);
> - notmuch->mode = NOTMUCH_DATABASE_MODE_READ_ONLY;
> - notmuch_database_destroy (notmuch);
> - notmuch = NULL;
> - status = NOTMUCH_STATUS_FILE_ERROR;
> - goto DONE;
> - }
> -
> - if (version < NOTMUCH_DATABASE_VERSION)
> - notmuch->needs_upgrade = TRUE;
> } else {
> notmuch->xapian_db = new Xapian::Database (xapian_path);
> - version = notmuch_database_get_version (notmuch);
> - if (version > NOTMUCH_DATABASE_VERSION)
> - {
> - fprintf (stderr,
> - "Warning: Notmuch database at %s\n"
> - " has a newer database format version (%u) than supported by this\n"
> - " version of notmuch (%u). Some operations may behave incorrectly,\n"
> - " (but the database will not be harmed since it is being opened\n"
> - " in read-only mode).\n",
> - notmuch_path, version, NOTMUCH_DATABASE_VERSION);
> - }
> }
>
> + /* Check version. As of database version 3, we represent
> + * changes in terms of features, so assume a version bump
> + * means a dramatically incompatible change. */
> + version = notmuch_database_get_version (notmuch);
> + if (version > NOTMUCH_DATABASE_VERSION) {
> + fprintf (stderr,
> + "Error: Notmuch database at %s\n"
> + " has a newer database format version (%u) than supported by this\n"
> + " version of notmuch (%u).\n",
> + notmuch_path, version, NOTMUCH_DATABASE_VERSION);
> + notmuch->mode = NOTMUCH_DATABASE_MODE_READ_ONLY;
> + notmuch_database_destroy (notmuch);
> + notmuch = NULL;
> + status = NOTMUCH_STATUS_FILE_ERROR;
> + goto DONE;
> + }
> +
> + /* Check features. */
> + incompat_features = NULL;
> + notmuch->features = _parse_features (
> + local, notmuch->xapian_db->get_metadata ("features").c_str (),
> + version, mode == NOTMUCH_DATABASE_MODE_READ_WRITE ? 'w' : 'r',
> + &incompat_features);
> + if (incompat_features) {
> + fprintf (stderr,
> + "Error: Notmuch database at %s\n"
> + " requires features (%s)\n"
> + " not supported by this version of notmuch.\n",
> + notmuch_path, incompat_features);
> + notmuch->mode = NOTMUCH_DATABASE_MODE_READ_ONLY;
> + notmuch_database_destroy (notmuch);
> + notmuch = NULL;
> + status = NOTMUCH_STATUS_FILE_ERROR;
> + goto DONE;
> + }
> +
> + /* Do we want an upgrade? */
> + if (mode == NOTMUCH_DATABASE_MODE_READ_WRITE &&
> + NOTMUCH_FEATURES_CURRENT & ~notmuch->features)
> + notmuch->needs_upgrade = TRUE;
> +
> notmuch->last_doc_id = notmuch->xapian_db->get_lastdocid ();
> last_thread_id = notmuch->xapian_db->get_metadata ("last_thread_id");
> if (last_thread_id.empty ()) {
> @@ -1077,6 +1200,7 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
> double progress),
> void *closure)
> {
> + void *local = talloc_new (NULL);
> Xapian::WritableDatabase *db;
> struct sigaction action;
> struct itimerval timerval;
> @@ -1226,6 +1350,8 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
> notmuch_query_destroy (query);
> }
>
> + notmuch->features |= NOTMUCH_FEATURES_CURRENT;
> + db->set_metadata ("features", _print_features (local, notmuch->features));
> db->set_metadata ("version", STRINGIFY (NOTMUCH_DATABASE_VERSION));
> db->flush ();
>
> @@ -1302,6 +1428,7 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
> sigaction (SIGALRM, &action, NULL);
> }
>
> + talloc_free (local);
> return NOTMUCH_STATUS_SUCCESS;
> }
>
> --
> 2.0.0
>
> _______________________________________________
> notmuch mailing list
> notmuch at notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch
More information about the notmuch
mailing list