[RFC PATCH 00/13] Modular message store code
Mark Walters
markwalters1009 at gmail.com
Wed Feb 15 16:56:27 PST 2012
On Wed, 15 Feb 2012 17:01:53 -0500, Ethan Glasser-Camp <glasse at cs.rpi.edu> wrote:
> Hi guys,
>
> I'm submitting as RFC this patch series, which introduces the idea of
> a "mailstore", a "class" that defines how to access mail, instead of
> currently assuming it's always some Maildir-ish hierarchy that
> contains a bunch of mail.
>
> This was listed as a wishlist item on
> http://notmuchmail.org/feature-requests/, so I went ahead and took a
> crack at it, learning a lot about the codebase as I did. I'm sure
> there are tons of stylistic concerns so I'd like to get as much
> feedback as possible, starting of course with "Does a feature like
> this have a chance of ever making it in" and followed by "Am I on the
> right track".
>
> Note that this series breaks the language bundings; the Python
> bindings have very minimal tests so I very minimally fixed them
> (probably still broken in other ways), but the Ruby and Go bindings
> are probably super broken. Note also that the one message = one file
> approach is pretty thoroughly embedded into Notmuch and there are lots
> of places (again such as the Python bindings) where this has yet to be
> rooted out.
>
> They say an interface isn't trustworthy until you've implemented it
> three times, so while most of the patches define the interface, the
> last patch adds support for an experimental CouchDB backend. It's got
> at least one known bug (it indexes everything, whether or not it's a
> mail object), sometimes it hangs when trying to access a message, and
> it's definitely leaking memory in notmuch-new. I haven't done strict
> timing comparisons but it seems like notmuch-search and notmuch-show
> are approximately as fast as with maildir and notmuch-new takes maybe
> 25% longer. Nevertheless, it is included as a demonstration that the
> interface is at least plausible.
>
> The implementation of "mailstores" follows these principles:
>
> - Messages still have "filenames", but the mailstore gets to define
> its own semantics for these filenames (document ids, file + byte
> offset..). _notmuch_message_ensure_filename_list converts all
> filenames coming out of the DB to be absolute paths centered at the
> user's database path, which is inconvenient for Couchdb, but workable.
Obviously I have not looked at the patch set in detail yet but I have a
quick question. Since you are allowing more general filenames anyway
couldn't you encode mailstore in filename? Eg
mbox://some-path[:byte-postion], or "imap://server..."
This would allow lots of different types of mailstore to be used
concurrently, and would push all the mailstore knowledge down into the
file handling functions and away from the callers of file handling
functions.
Of course there may be lots of good reasons why this doesn't work.
Best wishes
Mark
> - The user keeps all mail in one mailstore. The alternative, which is
> that each message can be in a different mailstore, seemed like a lot
> more work. "One mailstore" makes sense when it's cases like maildir
> vs. couchdb, but if we decide to introduce a "mbox" backend --
> directory tree with mbox files -- then it might "overlap" with the
> maildir mailstore, and then who knows?
>
> Patch 1 introduces the configuration parameter database.type, which
> will be used to select a mailstore type.
>
> Patch 2 introduces the most important API for a mailstore, notmuch_mailstore_open, and makes it required when creating a message_file. Patch 3 fixes the Python breakage this creates.
>
> Patch 4-6 replace the other places where files are opened directly with calls to notmuch_mailstore_open.
>
> Patch 7-8 prepare notmuch-new to be more generic. I couldn't find an elegant way to combine add_files logic with other mailstores, so I just decided each mailstore should have its own add_files function.
>
> Patches 9-11 add other functions to the mailstore API -- to rename files, to close files, and to "construct" a mailstore. Patch 12 uses the "close" API to close files (where necessary).
>
> Patch 13 proposes the CouchDB mailstore as one block of code.
>
> Points to address:
>
> - Where to put the new notmuch_mailstore_t* parameter in all these functions? I applied a "decreasing order of importance" heuristic, but it's a little weird in places like notmuch_database_open and notmuch_database_create.
>
> - Is there a better, more elegant way to pass around mailstore objects without adding parameters to each function? Additionally, should I be using some other class-like mechanism for mailstores instead of hacks involving structs?
>
> - Should this be configured under [database] or perhaps under some other new heading?
>
> - How strict is the rule that braces shouldn't be there if the body of a loop/conditional is only one line? This feels really strange to me coming from Python.
>
> - If I'm already touching code, should I add other drive-by fixes, as in patch 05, or should I resolutely refuse to change anything, as in patch 07?
>
> - Should something like the CouchDB backend be optional, and if so, what mechanisms do I need to use to make that happen?
>
> Thanks so much for your time!
> Ethan
>
> _______________________________________________
> notmuch mailing list
> notmuch at notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch
More information about the notmuch
mailing list