Better Gmail handling by not using Notmuch tags

Mark Anderson MarkR.Anderson at amd.com
Fri Sep 14 09:27:23 PDT 2012


On Fri, 14 Sep 2012 09:50:01 +0200, Rainer M Krug <R.M.Krug at gmail.com> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> On 13/09/12 17:15, Damien Cassou wrote:
> > On Thu, Sep 13, 2012 at 5:13 PM, Jeremy Nickurak 
> > <not-much at trk.nickurak.ca> wrote:
> >> Gmail doesn't have folders, of course, it has labels, which are
> >> approximately equivalent to notmuch tags. The key difference being
> >> that a message can only be in one folder, but it can have multiple
> >> tags/labels.
> > 
> > Gmail exports its labels as IMAP folders: an email with multiple
> > labels will be duplicated in multiple folders (one per
> > label). That's why I'm asking if it would be possible to manipulate
> > folders from Notmuch instead of tags.
> > 
> 
> I don't think there is an easy solution. notmuch uses a maildir and
> tags. Gmail needs to be synced to this local maildir earlier, and this
> is where I think the problem comes in: I am not aware of any sync tool
> which maintains the gmail labels, as they are in the imap context
> folders.
> 
> I think the only real solution would be:
> 
> Download from gmail -> local:
> 1) download only the "All Mail" folder
> 2) implement a tagging tool which syncs the gmail labels to notmuch tags

Gmail's IMAP protocol does expose a folder hierarchy which you can use
to reverse engineer the tag cloud of each email.  

Offlineimap will happily sync your Gmail such that each mail shows up
in every folder corresponding to a label attached to that mail.  When
notmuch scans the mail, it collapses these multiple copies into one
message by Message-ID, so you just need a script to translate folders
into tags.  A mismatch between a notmuch tag and the mail folders then
means something different depending on whether you are pre-sync or
post-sync.  I am calling the offlineimap run "sync", so once we have run
it (the first time for sure) we are in 'post-sync'.


You might want to take this chance to make your tag cloud coherent
between Notmuch and what exists in the folders, which works out to
something like this for every tag/folder pair in your gmail IMAP
directory (assuming it's synced to a Maildir repository) and notmuch DB.

  notmuch tag +TagX folder:FolderX and not tag:TagX 

Strictly speaking, the "and not tag:TagX" term isn't necessary, as
notmuch now adds it implicitly to improve performance and avoid touching
database entries that already have the tag.  I include it to demonstrate
the coherence between TagX and FolderX.

Then you'll want to handle TagX being removed on the Gmail side too,
so you'll want to do something like this to remove the notmuch TagX.
 
  notmuch tag -TagX tag:TagX and not folder:FolderX
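
The two post-sync commands can be wrapped in a loop over tag/folder
pairs.  This is only a minimal sketch: the post_sync_pairs name and the
"Tag:Folder" argument format are made up for illustration, and the
NOTMUCH override exists only so the commands can be previewed with
NOTMUCH=echo.  Tag or folder names containing spaces would also need
notmuch query quoting, which this sketch does not attempt.

```shell
# Hypothetical helper: apply both post-sync rules for each "Tag:Folder" pair.
# NOTMUCH can be overridden (e.g. NOTMUCH=echo) to preview the commands.
post_sync_pairs() {
    for pair in "$@"; do
        tag=${pair%%:*}      # text before the first colon
        folder=${pair#*:}    # text after the first colon
        # Gmail label added: mail is in FolderX but lacks TagX.
        ${NOTMUCH:-notmuch} tag "+$tag" -- "folder:$folder" and not "tag:$tag"
        # Gmail label removed: mail has TagX but is no longer in FolderX.
        ${NOTMUCH:-notmuch} tag "-$tag" -- "tag:$tag" and not "folder:$folder"
    done
}
```

For example, "post_sync_pairs notmuch:notmuch work:Work" would run both
commands for each of the two pairs.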

Then there's the reverse direction to consider.

When, before a sync, I see TagX in notmuch on a mail that is not in
FolderX (using FolderX as the proxy for the Gmail label), I assume that
the user added TagX in notmuch and I need to push the change to Gmail.
This one is a bit trickier.

  notmuch search --output=files tag:TagX and not folder:FolderX 

will give me the list of filenames, but I need to copy them into a
folder, so it's time for bash, or your favorite scripting language.
Spaces in filenames or tags are your bane here; if you have them, you'll
want to do something fancier than plain $() interpolation.

  # Requires MAILDIR to be exported in the environment.
  notmuch search --output=files tag:notmuch and not folder:notmuch |
  perl -MFile::Basename -ne 'chomp; system("cp", $_,
      "$ENV{MAILDIR}/notmuch/cur/" . basename($_));'
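
The same copy step can also be written as a small shell function that
reads one path per line, which sidesteps the problem with spaces in
filenames.  This is a sketch under assumptions: copy_to_folder is a
hypothetical name, and the folder path argument is whatever your Maildir
layout uses.

```shell
# Hypothetical helper: read one file path per line on stdin and copy each
# file into the cur/ directory of the given Maildir folder.
copy_to_folder() {
    dest=$1/cur
    mkdir -p "$dest" || return 1
    while IFS= read -r file; do
        [ -f "$file" ] || continue   # skip paths that vanished since the search
        cp -- "$file" "$dest/" || return 1
    done
}
```

Usage would look something like:

  notmuch search --output=files tag:TagX and not folder:FolderX |
  copy_to_folder "$MAILDIR/FolderX"

Note that --output=files lists every file belonging to a matching
message, so a mail that already has several copies gets copied several
times.  The sketch makes no attempt to deduplicate.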

Then when I see a mail in FolderX without TagX, that indicates that I
have removed the tag from notmuch, so I will want to remove the copy of
the file from FolderX, and only FolderX.

This requires you to scan through the list of filenames associated with
the msg-id and delete the file in FolderX.
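
That removal can be sketched as a filter over the file list: only paths
under FolderX are deleted, and every other copy is left alone.  The
remove_from_folder name is hypothetical, and the sketch assumes the
usual Maildir cur/ and new/ subdirectories.

```shell
# Hypothetical helper: read file paths on stdin and delete only those that
# live under the given Maildir folder; copies elsewhere are left untouched.
remove_from_folder() {
    folder=${1%/}    # drop one trailing slash, if any
    while IFS= read -r file; do
        case $file in
            "$folder"/cur/*|"$folder"/new/*) rm -f -- "$file" ;;
        esac
    done
}
```

Usage would look something like:

  notmuch search --output=files folder:FolderX and not tag:TagX |
  remove_from_folder "$MAILDIR/FolderX"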

Of course this is terrible for performance, as you will have many
copies of each mail when it carries many tags.  Here's a summary of the
actions you need to coordinate to keep the two in sync.


Stage      FolderX    TagX      Action
=====      =======    ====      ======
post-sync  No         No        No Action

post-sync  No         Yes       Gmail TagX removed, so FolderX copy is
                                gone; remove Notmuch TagX

post-sync  Yes        No        Gmail TagX added, add Notmuch TagX

post-sync  Yes        Yes       No Action

pre-sync   No         No        No Action

pre-sync   No         Yes       Notmuch TagX added, copy mail to FolderX
                                to add Gmail TagX corresponding to
                                notmuch TagX 

pre-sync   Yes        No        Notmuch TagX removed, delete mail copy
                                in FolderX

pre-sync   Yes        Yes       No Action
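
The table translates directly into a lookup.  Here is a minimal sketch,
where the sync_action name, the yes/no argument spelling, and the action
strings are all my own choices for illustration:

```shell
# Hypothetical helper: sync_action STAGE IN_FOLDER HAS_TAG
# STAGE is pre-sync or post-sync; the other two arguments are yes or no.
# Prints the action from the table above for that combination.
sync_action() {
    case "$1/$2/$3" in
        post-sync/no/yes) echo "remove notmuch tag" ;;
        post-sync/yes/no) echo "add notmuch tag" ;;
        pre-sync/no/yes)  echo "copy mail to folder" ;;
        pre-sync/yes/no)  echo "delete folder copy" ;;
        pre-sync/no/no|pre-sync/yes/yes|post-sync/no/no|post-sync/yes/yes)
            echo "no action" ;;
        *)  echo "invalid arguments: $*" >&2
            return 1 ;;
    esac
}
```

A driver script could call this once per message and tag/folder pair and
dispatch on the printed action.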

The worst part about this is that any interrupted action must be
retried until successful, unless we have information about the relative
times of the actions.  If, instead of trying to reconcile the two
"states" of my messages, I were synchronizing the changes themselves,
then I would just need to go down the list of changes and take the
appropriate action for each.

You might have other actions, such as a true delete in Notmuch, which
should remove all copies of the email before doing a mail sync.

The benefit of using the mail sync is that it relies on a widely
deployed mail synchronization model, but it really makes tags expensive
to synchronize.  It gets better if you use the Gmail IMAP extensions
that can list the labels without your client requesting a copy of the
entire email for each label the mail has.  However, even when you have
that, you don't have bulletproof mail, because the actions need to be
guaranteed to complete before synchronization and after synchronization,
and any user changes need to be held off, as they _will_ be interpreted
incorrectly if they take place during the pre-sync, sync, post-sync
window.

You can simplify this if you make guarantees in your usage model: for
example, that you will never do tagging operations during a pre-sync,
sync, post-sync cycle, or that you only synchronize one way or the
other instead of doing a full bidirectional sync.
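
For the first kind of guarantee, scripts that cooperate can at least
keep out of each other's way with a lock.  Below is a minimal sketch
using mkdir as an atomic lock; the with_sync_lock name and the lock path
are assumptions, and this does nothing to stop tagging done
interactively from a mail client.

```shell
# Hypothetical helper: with_sync_lock LOCKDIR COMMAND [ARGS...]
# Runs COMMAND only if LOCKDIR can be created; mkdir either creates the
# directory or fails, so two callers cannot both hold the lock.
with_sync_lock() {
    lockdir=$1
    shift
    if ! mkdir "$lockdir" 2>/dev/null; then
        echo "another sync cycle holds $lockdir" >&2
        return 1
    fi
    status=0
    "$@" || status=$?
    rmdir "$lockdir"     # release the lock even if COMMAND failed
    return "$status"
}
```

The whole pre-sync, sync, post-sync cycle would go into one script that
is run as the COMMAND, and any tagging script would take the same lock.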

It's a difficult problem, and I look forward to seeing other solutions
proposed.

Thanks,
-Mark Anderson

> upload local -> gmail
> 1) upload "All Mail" folder
> 2) assign on gmail the labels corresponding to the notmuch tags.
> 
> The step 1 could be done by any sync tool available for this (offlineimap, ...)
> 
> step 2 needs to be developed - no idea how, but it surely would be really useful, because then
> notmuch would even become a perfect tool for gmail backup as well.
> 
> Cheers,
> 
> Rainer
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.11 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://www.enigmail.net/
> 
> iEYEARECAAYFAlBS4akACgkQoYgNqgF2egoGhwCaAgfXQUAK4RK1v22JOhgYXfR1
> +C8AnRU892SrxK7IYN9xoxhM865Y+vTA
> =ma75
> -----END PGP SIGNATURE-----
> 


