More ideas about logging.

David Bremner bremner at debian.org
Tue Dec 20 12:25:53 PST 2011


On Sun, 18 Dec 2011 13:22:20 -0700, Tom Prince <tom.prince at ualberta.net> wrote:
> On Sun, 18 Dec 2011 14:34:00 -0400, David Bremner <bremner at debian.org> wrote:
> > The more worrying part is disk usage: the tag tree for 200k messages
> > uses 400k inodes and 836M of apparent disk usage (according to du); the
> > same tags in "sup" format take 11M.  Maybe this could be useful if
> > combined with some scheme to dump only tags not covered by maildir (for
> > those already using maildir flag syncing).
> 
> Well, it would seem natural to re-use the nmbug logic here, and just use
> a bare git repo for this. One would need a way to sync and merge the
> tag-tree automatically anyway. I admit I haven't tried nmbug yet, but it
> seems that nmbug, switched from syncing just notmuch:: to syncing
> everything but notmuch::, would be a sensible way to sync tags?

I was mainly interested in whether some guarantee of atomicity could be
given in a simple way.  The git update-index approach doesn't really make
that kind of guarantee.  Probably this is tolerable for a human-initiated
"dump" process; not so much for other uses.  Furthermore, much of the
motivation for both mtimes and logging is to make incremental dumping
possible, in order to avoid the time taken by a full dump.  This
experiment was also to see how feasible it was to insert some
"mkdir+creat" in the notmuch-tag critical path.

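To make the "mkdir+creat" cost concrete, here is a minimal Python sketch
of what a single tag operation might look like against a tag tree.  The
layout (one directory per message id, one empty file per tag), the helper
names add_tag and remove_tag, and the percent-encoding of names are all
assumptions for illustration, not the code used in the experiment.

```python
import os
from urllib.parse import quote


def _encode(name):
    # Percent-encode everything except letters, digits and a few safe
    # punctuation characters, so that "/" cannot escape the tree.
    # Hypothetical encoding, chosen only for this illustration.
    return quote(name, safe="")


def add_tag(root, message_id, tag):
    """Record a tag as an empty file: root/<message id>/<tag>."""
    msg_dir = os.path.join(root, _encode(message_id))
    os.makedirs(msg_dir, exist_ok=True)  # "mkdir": one directory per message
    path = os.path.join(msg_dir, _encode(tag))
    # "creat": an empty file whose name is the tag
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    os.close(fd)
    return path


def remove_tag(root, message_id, tag):
    """Remove a tag file; removing an absent tag is a no-op."""
    path = os.path.join(root, _encode(message_id), _encode(tag))
    try:
        os.unlink(path)
    except FileNotFoundError:
        pass
```

Each call is only a handful of system calls, but nothing in this sketch
makes a change touching several tags or messages atomic as a whole.
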
Since a few people have mentioned nmbug, I should confess that there are
(at least) 2 performance bugs lurking in it that make it probably not
yet suitable for large-scale tag syncing.

1) I did not get the merging working with only the index, so 
   nmbug currently makes a temporary checkout to do the merge.

2) Transferring tags from the git repo to xapian is currently quite slow
   because it does one call to git tag for each tag, rather than
   constructing an input for "notmuch restore".

I _think_ both of these are fixable in principle.  Maybe somebody with
better git internals knowledge than I would like to take a look at (1). 
(2) is just a SimpleMatterOfProgramming (TM). Patches, as they say, are
welcome ;).
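
For (2), the idea of constructing one input for "notmuch restore" instead
of making one call per tag can be sketched roughly as follows in Python.
The function name build_restore_input and its (message id, tag) pair input
are hypothetical, and the line format "message-id (tag1 tag2)" is my
understanding of the "sup" dump format rather than something stated in
this message.

```python
from collections import defaultdict


def build_restore_input(pairs):
    """Group (message_id, tag) pairs into one line per message.

    Assumed line format: "<message-id> (<tag> <tag> ...)".
    """
    tags_by_message = defaultdict(set)
    for message_id, tag in pairs:
        tags_by_message[message_id].add(tag)

    lines = []
    for message_id in sorted(tags_by_message):
        tags = " ".join(sorted(tags_by_message[message_id]))
        lines.append(f"{message_id} ({tags})")
    # One newline-terminated line per message; empty string if no pairs.
    return "".join(line + "\n" for line in lines)
```

The resulting text would then be fed to a single "notmuch restore" run on
standard input.  As far as I understand, restore in this format sets each
listed message's tags to exactly the tags on its line, so every line would
need to carry the complete tag set for that message, and tags containing
spaces would not survive this format.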

d



