[notmuch] Idea for storing tags

Scott Morrison smorr at indev.ca
Mon Jan 11 20:11:42 PST 2010


Thought you would be interested in my experiences and thoughts from actually doing this kind of stuff.  

With my software MailTags (www.indev.ca/MailTags.html), I have looked at all these options and decided to go with storing tags in headers (as JSON-formatted data in the X-MailTags header).
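As a rough illustration of the header approach, here is a minimal Python sketch using the standard `email` and `json` modules. Only the header name X-MailTags comes from this post; the JSON layout (`{"tags": [...]}`) and the helper names `set_tags` and `get_tags` are assumptions made for the example, not the actual MailTags schema.

```python
import json
from email import message_from_string

HEADER = "X-MailTags"  # header name from the post; the JSON layout is assumed


def set_tags(msg, tags):
    """Store the tags as JSON in the header, replacing any previous value."""
    del msg[HEADER]  # deleting a missing header is a no-op, not an error
    msg[HEADER] = json.dumps({"tags": sorted(set(tags))})


def get_tags(msg):
    """Return the stored tags, or an empty list if the header is absent."""
    raw = msg.get(HEADER)
    return json.loads(raw)["tags"] if raw else []


raw_mail = "From: a@example.org\nSubject: hi\nMessage-ID: <1@example.org>\n\nbody\n"
msg = message_from_string(raw_mail)
set_tags(msg, ["work", "todo"])
```

Because the tags travel inside the message itself, copying or moving the message carries them along, which is the main attraction of this option.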

I have thought seriously about using pseudo-emails stored in a specially named directory, but I feel there are a few issues with this.
	1. Synchronization of tag data with emails -- if the tag data is in a subfolder, then that subfolder has to be maintained whenever emails are managed (moved, deleted, duplicated, etc.), and any client unaware of the .tag folder is likely to break the association between tag data and message.  One way around this is to have a global .tag folder.

	2. What happens if that message is archived or moved to an exclusively local cache? E.g. Mail.app on OS X can easily move IMAP messages to a folder resident on the user's computer.
	3. what happens with duplicates of emails -- I would assume that the message id would be the key to match the tag data to the message.  In this system a duplicate of a message could not have a different set of tags from the original (not that this would necessarily be desirable.)
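
The duplicate problem in point 3 can be sketched in a few lines of Python. The store below is hypothetical (a plain dict keyed by Message-ID, with made-up helpers `tag` and `tags_of`); it only shows why two copies of one message cannot carry different tags under such a scheme.

```python
from email import message_from_string

tag_store = {}  # hypothetical external store: Message-ID -> set of tags


def tag(msg, *tags):
    """Attach tags to whatever Message-ID this copy carries."""
    tag_store.setdefault(msg["Message-ID"], set()).update(tags)


def tags_of(msg):
    """Look up the tags by Message-ID; empty set if none were stored."""
    return tag_store.get(msg["Message-ID"], set())


raw = "Message-ID: <42@example.org>\nSubject: report\n\nbody\n"
original = message_from_string(raw)
duplicate = message_from_string(raw)  # e.g. a copy filed in another folder

tag(original, "inbox")
tag(duplicate, "archive")
# Both copies now resolve to the same tag set, since the key is identical.
```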
	

As I mentioned, I went with tags in headers -- though this has its own drawbacks.
	The potential leakage you mention (aka inadvertent disclosure of tag data) is real -- but only if the client used to bounce/forward is not the one used to tag the message (one would assume that if a client can tag, it knows to exclude the tags in a bounce).  Mail.app -- which I am plugging into -- does not forward headers, though it will include all headers in a bounce.  But chances are you aren't tagging messages you are bouncing. :)
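
A tagging-aware client could exclude the tags before bouncing along these lines. This is only a sketch in Python: the function name `strip_private_headers` and the prefix list are assumptions for illustration, and it says nothing about how Mail.app or MailTags actually handle bounces.

```python
from email import message_from_string


def strip_private_headers(msg, prefixes=("x-mailtags",)):
    """Remove every header whose name starts with one of the given prefixes."""
    for name in list(msg.keys()):  # copy the names, since we delete as we go
        if name.lower().startswith(prefixes):
            del msg[name]  # removes all occurrences; no error if already gone
    return msg


tagged = message_from_string(
    "Subject: quarterly numbers\n"
    'X-MailTags: {"tags": ["from-idiots"]}\n'
    "\n"
    "body\n"
)
outgoing = strip_private_headers(tagged)
```

The weak point remains the one raised in the quoted message: this only helps when the bounce goes through a client (or MTA rule) that knows about the header.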

	The performance issue is very real, because it means that messages somehow have to be rewritten to the IMAP server -- IMAP doesn't have a mechanism AFAIK for updates.  Additionally, IMAP doesn't have a mechanism for simply replacing one message's data with another's -- a new message must be written and the old message must be deleted, so the message's IMAP UID will change, and the client will have to deal with this, especially if it is caching the messages.
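
The write-then-delete sequence described above looks roughly like this with Python's standard `imaplib`. The function name `replace_message` and its parameters are hypothetical; `conn` is assumed to be a logged-in `imaplib.IMAP4` connection with the mailbox already selected read-write.

```python
import imaplib
import time


def replace_message(conn: imaplib.IMAP4, mailbox, old_uid, new_bytes, flags=""):
    """Emulate an update: APPEND the rewritten message, then delete the old one."""
    # Write the new copy first, so a failure never loses the original.
    typ, _ = conn.append(mailbox, flags, time.time(), new_bytes)
    if typ != "OK":
        raise RuntimeError("APPEND failed; old message left untouched")
    # Flag the old copy as deleted, addressing it by UID.
    conn.uid("STORE", str(old_uid), "+FLAGS", r"(\Deleted)")
    # Plain EXPUNGE removes every message flagged \Deleted in the mailbox,
    # not just this one; UID EXPUNGE would need the UIDPLUS extension.
    conn.expunge()
```

The new copy gets a fresh UID from the server, so any client caching by UID has to drop the old entry and fetch the new one, and that is where the cost comes from.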

	Gmail IMAP is also an issue -- Gmail IMAP is not IMAP -- it simply doesn't work like a true IMAP server.  Writes to folders in Gmail IMAP are translated to database updates that associate a single record of the message with the folder it was "written" to.  Changing headers on a Gmail IMAP message simply will not work, because Gmail will treat the rewritten message as an update of that single record and reject it (and not actually write the new data).

Still, tags in headers meant that I didn't have to worry about making sure that the .tags folder is maintained appropriately (throughout moves and deletions), and that the data is stored much closer to the message, both for data recovery if it is ever needed and for archiving tags.  In any case, this is what I have working -- though I am open to considering new approaches.

Scott

PS: Also see my post to the mailtags list from a few years back:
http://lists.madduck.net/pipermail/mailtags/2007-August/msg00017.html

On 2010-01-11, at 5:19 PM, martin f krafft wrote:

> Folks, over in #notmuch, we just floated an idea that I'd like to
> get out to you. We've been debating storing tags for messages.
> Therefore I am cross-posting. Please forgive me.
> 
> So far, there are two approaches:
> 
> 1. External database, which has the downside of not being
>  synchronisable with standard IMAP, like the rest of your mail
>  (assuming you use IMAP). Also, it's possible for mailstore and
>  database to get out of sync.
> 
> 2. In-headers, which has the downside of leaking (e.g. when
>  bouncing), and incurs the risks associated with message rewrites
>  (which I think is pretty much ignorable, but it's still there).
>  Also, there's a performance issue, but in the context of an
>  indexer like notmuch, this is negligible.
> 
>  The leakage is real, though and I think it makes in-headers
>  unusable. After all, I don't ever want anyone else to know that
>  I tag e-mails from my boss as "from-idiots", and I forward and
>  bounce mail on a regular basis. I could tell my MTA to remove
>  those headers, but I might forget to do that on a new system.
> 
> We also previously determined that IMAP keywords are pretty much
> useless as they are stored per mailbox, not per message, not
> standardised, and limited in their length anyway [0]. This also
> means that we don't really need to investigate sensibly storing tags
> in Maildir (e.g. with xattrs), because IMAP cannot transport them.
> 
> 0. http://lists.madduck.net/pipermail/mailtags/2007-August/msg00016.html
> 
> Seriously, who implemented IMAPv4rev1 and what sort of crack were
> they smoking??
> 
> I remember there was some KDE groupware contacts manager that used
> IMAP to synchronise contacts. At first, this sounds horrible, but
> when you detach IMAP from RFC822, it becomes a generic synchronising
> protocol. The next step is then straight forward, and I want to
> share this idea with you:
> 
> How about using pseudo-mails stored in Maildir and synchronised by
> IMAP? E.g. every folder could have a subfolder .TAGS and if we find
> a way to smartly pair messages between parent and subfolder, we'd
> have a tag store alongside the mailstore it refers to, but without
> the danger of leakage, and without having to rewrite messages.
> 
> The major problem with this is when clients don't understand this
> "protocol", for then they will display all .TAGS folders as regular
> IMAP folders, and try to treat the messages therein as regular
> mails. Somewhere sometime this is bound to blow up and I don't
> really know how to prevent that.
> 
> Anyway, the idea is out now. Thoughts?
> 
> -- 
> martin | http://madduck.net/ | http://two.sentenc.es/
> 
> echo Prpv a\'rfg cnf har cvcr | tr Pacfghnrvp Cnpstuaeic
> 
> spamtraps: madduck.bogus at madduck.net
> _______________________________________________
> mailtags mailing list
> mailtags at lists.madduck.net
> http://lists.madduck.net/listinfo/mailtags


