[PATCH v3 04/13] lib: Database version 3: Introduce fine-grained "features"
Jani Nikula
jani at nikula.org
Sat Aug 23 09:02:39 PDT 2014
On Fri, 01 Aug 2014, Austin Clements <amdragon at MIT.EDU> wrote:
> Previously, our database schema was versioned by a single number.
> Each database schema change had to occur "atomically" in Notmuch's
> development history: before some commit, Notmuch used version N, after
> that commit, it used version N+1. Hence, each new schema version
> could introduce only one change, the task of developing a schema
> change fell on a single person, and it all had to happen and be
> perfect in a single commit series. This made introducing a new schema
> version hard. We've seen only two schema changes in the history of
> Notmuch.
>
> This commit introduces database schema version 3; hopefully the last
> schema version we'll need for a while. With this version, we switch
> from a single version number to "features": a set of named,
> independent aspects of the database schema.
>
> Features should make backwards compatibility easier. For many things,
> it should be easy to support databases both with and without a
> feature, which will allow us to make upgrades optional and will enable
> "unstable" features that can be developed and tested over time.
>
> Features also make forwards compatibility easier. The features
> recorded in a database include "compatibility flags," which can
> indicate to an older version of Notmuch when it must support a given
> feature to open the database for read or for write. This lets us
> replace the old vague "I don't recognize this version, so something
> might go wrong, but I promise to try my best" warnings upon opening a
> database with an unknown version with precise errors. If a database
> is safe to open for read/write despite unknown features, an older
> version will know that and issue no message at all. If the database
> is not safe to open for read/write because of unknown features, an
> older version will know that, too, and can tell the user exactly which
> required features it lacks support for.
I agree with the change overall; it might be useful to have another set
of eyes on the patch though.
> ---
> lib/database-private.h | 57 ++++++++++++++-
> lib/database.cc | 190 ++++++++++++++++++++++++++++++++++++++++---------
> 2 files changed, 213 insertions(+), 34 deletions(-)
>
> diff --git a/lib/database-private.h b/lib/database-private.h
> index d3e65fd..2ffab33 100644
> --- a/lib/database-private.h
> +++ b/lib/database-private.h
> @@ -41,11 +41,15 @@ struct _notmuch_database {
>
> char *path;
>
> - notmuch_bool_t needs_upgrade;
> notmuch_database_mode_t mode;
> int atomic_nesting;
> Xapian::Database *xapian_db;
>
> + /* Bit mask of features used by this database. Features are
> + * named, independent aspects of the database schema. This is a
> + * bitwise-OR of NOTMUCH_FEATURE_* values (below). */
> + unsigned int features;
> +
> unsigned int last_doc_id;
> uint64_t last_thread_id;
>
> @@ -55,6 +59,57 @@ struct _notmuch_database {
> Xapian::ValueRangeProcessor *date_range_processor;
> };
>
> +/* Bit masks for _notmuch_database::features. */
> +enum {
> + /* If set, file names are stored in "file-direntry" terms. If
> + * unset, file names are stored in document data.
> + *
> + * Introduced: version 1. Implementation support: both for read;
> + * required for write. */
Maybe I'm dense, but the implementation support comments could be
clearer.
> + NOTMUCH_FEATURE_FILE_TERMS = 1 << 0,
> +
> + /* If set, directory timestamps are stored in documents with
> + * XDIRECTORY terms and relative paths. If unset, directory
> + * timestamps are stored in documents with XTIMESTAMP terms and
> + * absolute paths.
> + *
> + * Introduced: version 1. Implementation support: required. */
> + NOTMUCH_FEATURE_DIRECTORY_DOCS = 1 << 1,
> +
> + /* If set, the from, subject, and message-id headers are stored in
> + * message document values. If unset, message documents *may*
> + * have these values, but if the value is empty, it must be
> + * retrieved from the message file.
> + *
> + * Introduced: optional in version 1, required as of version 3.
> + * Implementation support: both.
> + */
> + NOTMUCH_FEATURE_FROM_SUBJECT_ID_VALUES = 1 << 2,
> +
> + /* If set, folder terms are boolean and path terms exist. If
> + * unset, folder terms are probabilistic and stemmed and path
> + * terms do not exist.
> + *
> + * Introduced: version 2. Implementation support: required. */
> + NOTMUCH_FEATURE_BOOL_FOLDER = 1 << 3,
> +};
> +
> +/* Prior to database version 3, features were implied by the database
> + * version number, so hard-code them for earlier versions. */
> +#define NOTMUCH_FEATURES_V0 (0)
> +#define NOTMUCH_FEATURES_V1 (NOTMUCH_FEATURES_V0 | NOTMUCH_FEATURE_FILE_TERMS | \
> + NOTMUCH_FEATURE_DIRECTORY_DOCS)
> +#define NOTMUCH_FEATURES_V2 (NOTMUCH_FEATURES_V1 | NOTMUCH_FEATURE_BOOL_FOLDER)
> +
> +/* Current database features. If any of these are missing from a
> + * database, request an upgrade.
> + * NOTMUCH_FEATURE_FROM_SUBJECT_ID_VALUES is not included because
> + * upgrade doesn't currently introduce the feature (though brand new
> + * databases will have it). */
> +#define NOTMUCH_FEATURES_CURRENT \
> + (NOTMUCH_FEATURE_FILE_TERMS | NOTMUCH_FEATURE_DIRECTORY_DOCS | \
> + NOTMUCH_FEATURE_BOOL_FOLDER)
> +
> /* Return the list of terms from the given iterator matching a prefix.
> * The prefix will be stripped from the strings in the returned list.
> * The list will be allocated using ctx as the talloc context.
> diff --git a/lib/database.cc b/lib/database.cc
> index 45c4260..29a56db 100644
> --- a/lib/database.cc
> +++ b/lib/database.cc
> @@ -20,6 +20,7 @@
>
> #include "database-private.h"
> #include "parse-time-vrp.h"
> +#include "string-util.h"
>
> #include <iostream>
>
> @@ -42,7 +43,7 @@ typedef struct {
> const char *prefix;
> } prefix_t;
>
> -#define NOTMUCH_DATABASE_VERSION 2
> +#define NOTMUCH_DATABASE_VERSION 3
>
> #define STRINGIFY(s) _SUB_STRINGIFY(s)
> #define _SUB_STRINGIFY(s) #s
> @@ -151,6 +152,17 @@ typedef struct {
> * changes are made to the database (such as by
> * indexing new fields).
> *
> + * features The set of features supported by this
> + * database. This consists of a set of
> + * '\n'-separated lines, where each is a feature
> + * name, a '\t', and compatibility flags. If the
> + * compatibility flags contain 'w', then the
> + * opener must support this feature to safely
> + * write this database. If the compatibility
> + * flags contain 'r', then the opener must
> + * support this feature to read this database.
> + * Introduced in database version 3.
> + *
> * last_thread_id The last thread ID generated. This is stored
> * as a 16-byte hexadecimal ASCII representation
> * of a 64-bit unsigned integer. The first ID
> @@ -251,6 +263,25 @@ _find_prefix (const char *name)
> return "";
> }
>
> +static const struct
> +{
Brace should be at the end of the previous line.
> + /* NOTMUCH_FEATURE_* value. */
> + unsigned int value;
> + /* Feature name as it appears in the database. This name should
> + * be appropriate for displaying to the user if an older version
> + * of notmuch doesn't support this feature. */
> + const char *name;
> + /* Compatibility flags when this feature is declared. */
> + const char *flags;
> +} feature_names[] = {
> + {NOTMUCH_FEATURE_FILE_TERMS, "multiple paths per message", "rw"},
> + {NOTMUCH_FEATURE_DIRECTORY_DOCS, "relative directory paths", "rw"},
> + /* Header values are not required for reading a database because a
> + * reader can just refer to the message file. */
> + {NOTMUCH_FEATURE_FROM_SUBJECT_ID_VALUES, "from/subject/message-ID in database", "w"},
> + {NOTMUCH_FEATURE_BOOL_FOLDER, "exact folder:/path: search", "rw"},
Spaces missing after the opening braces and before the closing braces.
> +};
> +
> const char *
> notmuch_status_to_string (notmuch_status_t status)
> {
> @@ -591,6 +622,11 @@ notmuch_database_create (const char *path, notmuch_database_t **database)
> ¬much);
> if (status)
> goto DONE;
> +
> + /* Upgrade doesn't add this feature to existing databases, but new
> + * databases have it. */
> + notmuch->features |= NOTMUCH_FEATURE_FROM_SUBJECT_ID_VALUES;
> +
> status = notmuch_database_upgrade (notmuch, NULL, NULL);
> if (status) {
> notmuch_database_close(notmuch);
> @@ -619,6 +655,80 @@ _notmuch_database_ensure_writable (notmuch_database_t *notmuch)
> return NOTMUCH_STATUS_SUCCESS;
> }
>
> +/* Parse a database features string from the given database version.
> + *
> + * For version < 3, this ignores the features string and returns a
> + * hard-coded set of features.
> + *
> + * If there are unrecognized features that are required to open the
> + * database in mode (which should be 'r' or 'w'), return a
> + * comma-separated list of unrecognized but required features in
> + * *incompat_out, which will be allocated from ctx.
Please describe the actual return value.
> + */
> +static unsigned int
> +_parse_features (const void *ctx, const char *features, unsigned int version,
> + char mode, char **incompat_out)
> +{
> + unsigned int res = 0, namelen, i;
> + size_t llen = 0;
> + const char *flags;
> +
> + /* Prior to database version 3, features were implied by the
> + * version number. */
> + if (version == 0)
> + return NOTMUCH_FEATURES_V0;
> + else if (version == 1)
> + return NOTMUCH_FEATURES_V1;
> + else if (version == 2)
> + return NOTMUCH_FEATURES_V2;
> +
> + /* Parse the features string */
> + while ((features = strtok_len_c (features + llen, "\n", &llen)) != NULL) {
> + flags = strchr (features, '\t');
> + if (! flags || flags > features + llen)
> + continue;
> + namelen = flags - features;
> +
> + for (i = 0; i < ARRAY_SIZE (feature_names); ++i) {
> + if (strlen (feature_names[i].name) == namelen &&
> + strncmp (feature_names[i].name, features, namelen) == 0) {
> + res |= feature_names[i].value;
> + break;
> + }
> + }
> +
> + if (i == ARRAY_SIZE (feature_names)) {
> + /* Unrecognized feature */
> + const char *have = strchr (flags, mode);
> + if (have && have < features + llen) {
> + /* This feature is required to access this database in
> + * 'mode', but we don't understand it. */
> + if (! *incompat_out)
> + *incompat_out = talloc_strdup (ctx, "");
> + *incompat_out = talloc_asprintf_append_buffer (
> + *incompat_out, "%s%.*s", **incompat_out ? ", " : "",
> + namelen, features);
Do we intend the lib user to be able to parse the features? Perhaps we
should use '\n' as separator here too? (Although looks like *currently*
this is only for printing the message from within the lib.)
> + }
> + }
> + }
> +
> + return res;
> +}
> +
> +static char *
> +_print_features (const void *ctx, unsigned int features)
> +{
> + unsigned int i;
> + char *res = talloc_strdup (ctx, "");
> +
> + for (i = 0; i < ARRAY_SIZE (feature_names); ++i)
> + if (features & feature_names[i].value)
> + res = talloc_asprintf_append_buffer (
> + res, "%s\t%s\n", feature_names[i].name, feature_names[i].flags);
> +
> + return res;
> +}
> +
> notmuch_status_t
> notmuch_database_open (const char *path,
> notmuch_database_mode_t mode,
> @@ -627,7 +737,7 @@ notmuch_database_open (const char *path,
> notmuch_status_t status = NOTMUCH_STATUS_SUCCESS;
> void *local = talloc_new (NULL);
> notmuch_database_t *notmuch = NULL;
> - char *notmuch_path, *xapian_path;
> + char *notmuch_path, *xapian_path, *incompat_features;
> struct stat st;
> int err;
> unsigned int i, version;
> @@ -677,7 +787,6 @@ notmuch_database_open (const char *path,
> if (notmuch->path[strlen (notmuch->path) - 1] == '/')
> notmuch->path[strlen (notmuch->path) - 1] = '\0';
>
> - notmuch->needs_upgrade = FALSE;
> notmuch->mode = mode;
> notmuch->atomic_nesting = 0;
> try {
> @@ -686,37 +795,44 @@ notmuch_database_open (const char *path,
> if (mode == NOTMUCH_DATABASE_MODE_READ_WRITE) {
> notmuch->xapian_db = new Xapian::WritableDatabase (xapian_path,
> Xapian::DB_CREATE_OR_OPEN);
> - version = notmuch_database_get_version (notmuch);
> -
> - if (version > NOTMUCH_DATABASE_VERSION) {
> - fprintf (stderr,
> - "Error: Notmuch database at %s\n"
> - " has a newer database format version (%u) than supported by this\n"
> - " version of notmuch (%u). Refusing to open this database in\n"
> - " read-write mode.\n",
> - notmuch_path, version, NOTMUCH_DATABASE_VERSION);
> - notmuch->mode = NOTMUCH_DATABASE_MODE_READ_ONLY;
> - notmuch_database_destroy (notmuch);
> - notmuch = NULL;
> - status = NOTMUCH_STATUS_FILE_ERROR;
> - goto DONE;
> - }
> -
> - if (version < NOTMUCH_DATABASE_VERSION)
> - notmuch->needs_upgrade = TRUE;
> } else {
> notmuch->xapian_db = new Xapian::Database (xapian_path);
> - version = notmuch_database_get_version (notmuch);
> - if (version > NOTMUCH_DATABASE_VERSION)
> - {
> - fprintf (stderr,
> - "Warning: Notmuch database at %s\n"
> - " has a newer database format version (%u) than supported by this\n"
> - " version of notmuch (%u). Some operations may behave incorrectly,\n"
> - " (but the database will not be harmed since it is being opened\n"
> - " in read-only mode).\n",
> - notmuch_path, version, NOTMUCH_DATABASE_VERSION);
> - }
> + }
> +
> + /* Check version. As of database version 3, we represent
> + * changes in terms of features, so assume a version bump
> + * means a dramatically incompatible change. */
> + version = notmuch_database_get_version (notmuch);
> + if (version > NOTMUCH_DATABASE_VERSION) {
> + fprintf (stderr,
> + "Error: Notmuch database at %s\n"
> + " has a newer database format version (%u) than supported by this\n"
> + " version of notmuch (%u).\n",
> + notmuch_path, version, NOTMUCH_DATABASE_VERSION);
> + notmuch->mode = NOTMUCH_DATABASE_MODE_READ_ONLY;
> + notmuch_database_destroy (notmuch);
> + notmuch = NULL;
> + status = NOTMUCH_STATUS_FILE_ERROR;
> + goto DONE;
> + }
> +
> + /* Check features. */
> + incompat_features = NULL;
> + notmuch->features = _parse_features (
> + local, notmuch->xapian_db->get_metadata ("features").c_str (),
> + version, mode == NOTMUCH_DATABASE_MODE_READ_WRITE ? 'w' : 'r',
Makes me think _parse_features could use notmuch_database_mode_t mode
instead of char mode. *shrug*.
> + &incompat_features);
> + if (incompat_features) {
> + fprintf (stderr,
> + "Error: Notmuch database at %s\n"
> + " requires features (%s)\n"
> + " not supported by this version of notmuch.\n",
> + notmuch_path, incompat_features);
> + notmuch->mode = NOTMUCH_DATABASE_MODE_READ_ONLY;
> + notmuch_database_destroy (notmuch);
> + notmuch = NULL;
> + status = NOTMUCH_STATUS_FILE_ERROR;
> + goto DONE;
> }
>
> notmuch->last_doc_id = notmuch->xapian_db->get_lastdocid ();
> @@ -1048,7 +1164,8 @@ notmuch_database_get_version (notmuch_database_t *notmuch)
> notmuch_bool_t
> notmuch_database_needs_upgrade (notmuch_database_t *notmuch)
> {
> - return notmuch->needs_upgrade;
> + return notmuch->mode == NOTMUCH_DATABASE_MODE_READ_WRITE &&
> + (NOTMUCH_FEATURES_CURRENT & ~notmuch->features);
> }
Am I correct that this does not lead to a database upgrade from v2 to v3
until we actually change the features?
Aside, why don't we return a suitable status code from
notmuch_database_open when an upgrade is required?
> static volatile sig_atomic_t do_progress_notify = 0;
> @@ -1077,6 +1194,7 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
> double progress),
> void *closure)
> {
> + void *local = talloc_new (NULL);
> Xapian::WritableDatabase *db;
> struct sigaction action;
> struct itimerval timerval;
> @@ -1114,6 +1232,10 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
> timer_is_active = TRUE;
> }
>
> + /* Set the target features so we write out changes in the desired
> + * format. */
> + notmuch->features |= NOTMUCH_FEATURES_CURRENT;
> +
> /* Before version 1, each message document had its filename in the
> * data field. Copy that into the new format by calling
> * notmuch_message_add_filename.
> @@ -1226,6 +1348,7 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
> notmuch_query_destroy (query);
> }
>
> + db->set_metadata ("features", _print_features (local, notmuch->features));
> db->set_metadata ("version", STRINGIFY (NOTMUCH_DATABASE_VERSION));
> db->flush ();
>
> @@ -1302,6 +1425,7 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
> sigaction (SIGALRM, &action, NULL);
> }
>
> + talloc_free (local);
> return NOTMUCH_STATUS_SUCCESS;
> }
>
> --
> 2.0.0
>
> _______________________________________________
> notmuch mailing list
> notmuch at notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch
More information about the notmuch
mailing list