WIP: add metadata to dump output

Tomi Ollila tomi.ollila at iki.fi
Sun Jan 10 06:36:59 PST 2016


On Sun, Jan 10 2016, David Bremner <david at tethera.net> wrote:

> It seems (at least to me) that xapian metadata is the right way to store
> certain configuration data, including tag aliases [1] and perhaps some
> non-CLI specific configuration. On the other hand we don't want to
> have things lost if we dump and restore a database. Hence this series,
> which is a start at dumping and restoring such config.
>
> The main idea here is that various classes of metadata can be defined
> by using prefixes, in exactly the same way as tags are defined for
> documents. This will hopefully help prevent e.g. config from stomping
> on tag aliases.
>
> The first 6 patches implement iterators for simple "queries" on
> metadata. They are probably split a bit fine, but that's the way I
> developed them.
>
> The last 3 implement the printing of metadata in dump output. In order
> to be upwardly compatible, it uses the old dodge of hiding things in
> comments. In fact the comment syntax (# in first column) was never
> well documented; this does mean that the notmuch dump output can be
> tested without breaking the current restore tests. I threw an @ in to
> help autodetection of formats; obviously this is not foolproof. On the
> other hand, I don't know how much people currently rely on comments in
> dump files, since notmuch doesn't generate them.
>
> There's lots of bikes to shed here. Probably the most important bits
> are the library API, the dump output format, and of course the ever
> tricky command line argument names.

Generally this series looks pretty good. IMO this could have been done with
far fewer separate patches -- that would have made the review easier;
now I had to go back to previous mails just to look up context. But
anyone who disagrees with this should let David know (in any appropriate
channel, so my opinion does not get too emphasized ;D)

The first thing that came to my mind was the naming of
*_FIRST_CLASS and *_LAST_CLASS in the enum values. The naming
is inconsistent in the sense that first is the first value, but last is
last + 1. Unfortunately there is nothing we can do about that, as these
*_LAST_* names are used in other enums too, so we just have to live with it.
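
To make the "first is first, last is last + 1" point concrete, here is a
minimal sketch of that convention. The EXAMPLE_* names and the starting
value 1 are assumptions made for illustration, not the definitions from
the patch.

```c
/* Hypothetical enum following the FIRST/LAST convention discussed above.
 * FIRST names a real value; LAST is a sentinel one past the final value. */
typedef enum {
    EXAMPLE_FIRST_CLASS = 1,
    EXAMPLE_CONFIG = EXAMPLE_FIRST_CLASS,
    EXAMPLE_LAST_CLASS /* sentinel: last real class + 1 */
} example_class_t;

/* Number of real classes in the half-open range [FIRST, LAST). */
int example_class_count (void)
{
    return EXAMPLE_LAST_CLASS - EXAMPLE_FIRST_CLASS;
}

/* A value is a real class only if it is strictly below the sentinel. */
int example_class_is_valid (int c)
{
    return c >= EXAMPLE_FIRST_CLASS && c < EXAMPLE_LAST_CLASS;
}
```

With this layout a loop needs `<` rather than `<=` against the LAST value,
which is why the asymmetry in the names is easy to trip over.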

In the last patch of this series there is
+typedef enum dump_includes {
+    DUMP_INCLUDE_TAGS=1,
+    DUMP_INCLUDE_METADATA=2,
+} dump_include_t;

-- the spacing around ' = ' is missing. I did not see other whitespace
errors (not that there might not be some, though, as we know David ;)
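
For reference, a sketch of the same typedef with the spacing added. The
values 1 and 2 suggest these are meant as bit flags, but that is my
assumption; the helper dump_wants() is hypothetical and not part of the
series.

```c
/* The enum from the patch, with spaces around '='. */
typedef enum dump_includes {
    DUMP_INCLUDE_TAGS = 1,
    DUMP_INCLUDE_METADATA = 2,
} dump_include_t;

/* Hypothetical helper: is the flag 'what' set in the mask 'include'?
 * Assumes the values are combined with bitwise OR. */
int dump_wants (int include, dump_include_t what)
{
    return (include & what) != 0;
}
```

A caller would then pass something like
DUMP_INCLUDE_TAGS | DUMP_INCLUDE_METADATA to ask for both.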

One bug I found:

+    for (mclass = NOTMUCH_METADATA_FIRST_CLASS; mclass < NOTMUCH_METADATA_LAST_CLASS; mclass++) {
+	status = notmuch_database_get_all_metadata (notmuch, NOTMUCH_METADATA_CONFIG, &meta);

(mclass should be passed there instead of NOTMUCH_METADATA_CONFIG).
Currently, as there is only that one class in the enum, there is no
problem -- and for the same reason the current test cannot notice this.
If this were not fixed, it would be noticed in the future by that
particular test -- unless it is changed erroneously ;)
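
A small self-contained sketch of why the bug is invisible today and would
show up once a second class exists. Everything here is a stand-in: the
EXAMPLE_META_* enum (including the invented ALIAS class) and
mock_get_all_metadata() are hypothetical and only record which class was
requested; none of it is the real library code.

```c
#include <stddef.h>

/* Hypothetical stand-in for the metadata class enum. EXAMPLE_META_ALIAS is
 * an invented second class, added only to make the bug observable. */
typedef enum {
    EXAMPLE_META_FIRST = 1,
    EXAMPLE_META_CONFIG = EXAMPLE_META_FIRST,
    EXAMPLE_META_ALIAS,
    EXAMPLE_META_LAST /* one past the last real class */
} example_meta_class_t;

#define MAX_CALLS 8
example_meta_class_t calls[MAX_CALLS]; /* classes requested, in order */
size_t n_calls;

/* Mock getter (not the real library call): records the requested class. */
int mock_get_all_metadata (example_meta_class_t mclass)
{
    if (n_calls >= MAX_CALLS)
        return -1;
    calls[n_calls++] = mclass;
    return 0;
}

/* Loop shaped like the one quoted above: the class is hard-coded. */
int dump_hardcoded (void)
{
    int mclass;

    n_calls = 0;
    for (mclass = EXAMPLE_META_FIRST; mclass < EXAMPLE_META_LAST; mclass++) {
        if (mock_get_all_metadata (EXAMPLE_META_CONFIG))
            return -1;
    }
    return 0;
}

/* Suggested fix: pass the loop variable so each class is requested once. */
int dump_per_class (void)
{
    int mclass;

    n_calls = 0;
    for (mclass = EXAMPLE_META_FIRST; mclass < EXAMPLE_META_LAST; mclass++) {
        if (mock_get_all_metadata ((example_meta_class_t) mclass))
            return -1;
    }
    return 0;
}
```

With two classes, the hard-coded loop asks for the config class twice and
never asks for the second one, while the fixed loop asks for each class
once. With a single class both loops behave identically, which matches why
the current test passes.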


Anyway, good stuff in general...


Tomi

> Getting the memory ownership semantics is tricky, especially with the
> mix of C++ objects and talloc. So I'd appreciate a critical eye on
> those bits of metadata.cc.

uh puh -- maybe I'll look at that again (hmm, I have to apply the patch
series, as metadata.cc is not all in one patch ;/)

