[notmuch] Fwd: Re: nested tag trees (was: Mail in git)

Ben Gamari bgamari at gmail.com
Wed Feb 17 20:38:13 PST 2010


--- Begin forwarded message from martin f krafft ---
From: martin f krafft <madduck at madduck.net>
To: Ben Gamari <bgamari at gmail.com>
Date: Wed, 17 Feb 2010 22:46:13 -0500
Subject: Re: nested tag trees (was: [notmuch] Mail in git)

You ought to have sent to the list, and I want to send mine there
too, so please give permission.

also sprach Ben Gamari <bgamari at gmail.com> [2010.02.18.1620 +1300]:
> This is a very good point. From what I've read about the database
> format, I can't think of any way that reverse dependencies could be
> easily found, unfortunately. If there really is no way to do this, then
> we could have a problem. I'm not sure rewriting tens of megabytes
> everytime you receive a mail message is acceptable.

You would not need to do that, since the messages don't change, and
thus their blobs remain the same.

However, for every manipulation of a message, you would need to
iterate *all* tag trees (O(n)) and update the ones referencing the
message (also O(n)).

The entire process will still be O(n) per message, and O(m×n) for
all:

  messages=[list of messages]
  add_tags=[list of tags to add]
  remove_tags=[list of tags to remove]
  tagtrees=[all tag trees]
  trees_to_update=[]

  for t in remove_tags:
    if intersection(t.tree.children, messages):
      T = new_tree(t.name)
      write_tree(T, t.tree.children - messages)
      write_tree(t.tree, [])
      t.tree = T

  for t in add_tags:
    t.tree = new_tree(t.name)
    rewrite_tree(t.tree, messages)

This can probably be further optimised, but still: it's not quite as
nice as enumerating all parents of a message in O(1) time (which
would still result in O(m×n)).

--- End forwarded message ---


More information about the notmuch mailing list