Distributed Notmuch
Ethan Glasser-Camp
glasse at cs.rpi.edu
Sun Jan 8 02:23:59 PST 2012
Hi guys,
It's kind of academic for me right now because I'm mostly just using one
computer, but one reason I've hesitated to switch over entirely to
notmuch is that it's hard to distribute across many machines. The last
time I wrote to the list about this, David Bremner pointed me to gitmuch in
the notmuch-scripts package, which uses git to synchronize tags. He
wrote, "No one claims this is a great solution, but it works now."
In brainstorming about the One True Mail Setup, my friend suggested to
me that Maildir/IMAP are not really the best choices for mail storage.
Among other flaws: to synchronize mail via IMAP you have to check the
headers of each message, which means a lot of bandwidth; you can't
compress Maildir, meaning lots of wasted space; mechanisms to
synchronize arbitrary tags have to be bolted on top. My friend suggested
that instead it might be better to dump mail into some kind of database,
for example CouchDB, and synchronize it that way. Of course, doing
full-text indexing and tagging using an arbitrary DB would be a ton of
work, so instead it probably makes the most sense to keep a Xapian
instance on each node and feed all the mail to that. Tagging operations
still have to be replicated, probably by an oplog that's also kept in
Couch, so it's still a lot of work, but keeping things in Couch
automatically gets you a lot of the replication mechanisms, offline
access, etc., that would have to be bolted on/hacked up using tools like
nmbug/gitmuch/rsync. I also see in the wiki that someone proposes to use
git as the mail store, presumably for similar reasons. Xapian itself has
the idea of master/slave replication, but that doesn't really get you
full offline access.
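To make the oplog idea concrete, here is a rough sketch of what I have in
mind. It is plain Python and talks to neither Couch nor Xapian; the names
(TagOp, merge_oplogs, replay) and the record layout are made up for
illustration, and it assumes the nodes' clocks are close enough that
ordering operations by timestamp is acceptable. Each node appends tag
operations to its own log, the logs get replicated however you like, and
every node replays the merged log to arrive at the same tags.

```python
from collections import defaultdict
from typing import NamedTuple


class TagOp(NamedTuple):
    """One tagging operation (hypothetical record layout)."""
    timestamp: float  # when the op was made; assumes roughly synced clocks
    node: str         # which machine made it; breaks timestamp ties
    message_id: str   # Message-ID of the mail being tagged
    action: str       # "+" to add the tag, "-" to remove it
    tag: str


def merge_oplogs(*logs):
    """Union the logs, drop duplicate entries, and order them by tuple
    comparison (timestamp first, then node, then the remaining fields)."""
    return sorted({op for log in logs for op in log})


def replay(ops):
    """Apply ops in order and return a dict of message_id -> set of tags."""
    tags = defaultdict(set)
    for op in ops:
        if op.action == "+":
            tags[op.message_id].add(op.tag)
        elif op.action == "-":
            tags[op.message_id].discard(op.tag)
        else:
            raise ValueError(f"unknown action: {op.action!r}")
    return dict(tags)


# Two nodes that tagged the same message while offline.
laptop = [
    TagOp(1.0, "laptop", "msg1@example", "+", "inbox"),
    TagOp(3.0, "laptop", "msg1@example", "-", "inbox"),
]
desktop = [
    TagOp(2.0, "desktop", "msg1@example", "+", "todo"),
]

merged = merge_oplogs(laptop, desktop)
print(replay(merged))
```

Because the merge is a sorted set union, it doesn't matter which order the
logs arrive in or whether an entry is seen twice, which is about what you
would want from something riding on top of Couch replication. Each node
would still have to push the resulting tags into its local Xapian index,
and that part is not shown.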
So my question for the wizards on this list is what their idea of the
One True Mail Setup would be in a perfect, or slightly better, world,
and what needs to be done to get there. I know some people use one
notmuch install that they access remotely. For myself, I'm on a pretty
limited Internet connection, so low bandwidth/offline access are big for
me, and despite Nicolas Sebrecht and Sebastian Spaeth's heroic work on
OfflineIMAP, it still uses a lot of bandwidth to sync. And obviously the
whole point of this exercise is tag synchronization.
Ethan