Distributed Notmuch

Ethan Glasser-Camp glasse at cs.rpi.edu
Sun Jan 8 02:23:59 PST 2012


Hi guys,

It's kind of academic for me right now because I'm mostly just using one 
computer, but one reason I've hesitated to switch over entirely to 
notmuch is that it's hard to distribute across many machines. The last 
time I wrote to the list about this, David Bremner pointed me to gitmuch in 
the notmuch-scripts package, which uses git to synchronize tags. He 
wrote, "No one claims this is a great solution, but it works now."

In brainstorming about the One True Mail Setup, my friend suggested to 
me that Maildir/IMAP are not really the best choices for mail storage. 
Among other flaws: to synchronize mail via IMAP you have to check the 
headers of each message, which means a lot of bandwidth; you can't 
compress Maildir, meaning lots of wasted space; mechanisms to 
synchronize arbitrary tags have to be bolted on top. My friend suggested 
that instead it might be better to dump mail into some kind of database, 
for example CouchDB, and synchronize it that way. Of course, doing 
full-text indexing and tagging using an arbitrary DB would be a ton of 
work, so instead it probably makes the most sense to keep a Xapian 
instance on each node and feed all the mail to that. Tagging operations 
still have to be replicated, probably by an oplog that's also kept in 
Couch, so it's still a lot of work, but keeping things in Couch 
automatically gets you a lot of the replication mechanisms, offline 
access, etc., that would have to be bolted on/hacked up using tools like 
nmbug/gitmuch/rsync. I also see in the wiki that someone proposes to use 
git as the mail store, presumably for similar reasons. Xapian itself has 
the idea of master/slave replication but that doesn't really get you 
full offline access.
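
To make the oplog idea a bit more concrete, here is a rough sketch in 
Python of what replaying replicated tag operations on each node might 
look like. Everything in it (TagOp, merge_logs, replay, and the rule of 
ordering operations by timestamp and then node name) is made up for 
illustration; none of it is notmuch, Xapian, or CouchDB API, and the 
actual storage and replication of the log would be left to Couch.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class TagOp:
    """One tagging operation (hypothetical record format)."""
    timestamp: float   # when the operation was made
    node: str          # which machine made it (breaks timestamp ties)
    message_id: str    # the message the operation applies to
    action: str        # "+" to add the tag, "-" to remove it
    tag: str


def merge_logs(*logs):
    """Union the operations from every node and sort them.

    Sorting on (timestamp, node, ...) gives every node the same order,
    no matter which log it happened to receive first.
    """
    return sorted(set().union(*logs))


def replay(ops):
    """Apply ordered operations to an empty state.

    Returns a dict mapping message_id to its set of tags.
    """
    tags = {}
    for op in ops:
        current = tags.setdefault(op.message_id, set())
        if op.action == "+":
            current.add(op.tag)
        elif op.action == "-":
            current.discard(op.tag)
        else:
            raise ValueError("unknown action: %r" % op.action)
    return tags


# Two machines tag the same message while disconnected from each other.
laptop = [
    TagOp(1.0, "laptop", "msg1@example", "+", "inbox"),
    TagOp(3.0, "laptop", "msg1@example", "-", "inbox"),
]
desktop = [TagOp(2.0, "desktop", "msg1@example", "+", "todo")]

state = replay(merge_logs(laptop, desktop))
```

In a real setup the resulting tags would then be fed to the local Xapian 
index. Ordering by wall-clock timestamp is only one possible way to 
resolve conflicts, and it assumes the machines' clocks roughly agree.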

So my question for the wizards on this list is what their idea of the 
One True Mail Setup would be in a perfect, or slightly better, world, 
and what needs to be done to get there. I know some people use one 
notmuch install that they access remotely. For myself, I'm on a pretty 
limited Internet connection, so low bandwidth/offline access are big for 
me, and despite Nicolas Sebrecht and Sebastian Spaeth's heroic work on 
OfflineIMAP, it still uses a lot of bandwidth to sync. And obviously the 
whole point of this exercise is tag synchronization.

Ethan


