Alternative (raw) message store (i.e. instead of maildir)
Ciprian Dorin Craciun
ciprian.craciun at gmail.com
Tue Aug 14 10:05:11 PDT 2012
On Tue, Aug 14, 2012 at 7:50 PM, Vladimir Marek
<Vladimir.Marek at oracle.com> wrote:
>> On the other hand I strongly sustain having a more optimized
>> backend for emails, especially for such cases. For example a
>> BerkeleyDB would perfectly fit such a use case, especially if we store
>> the body and the headers in separate databases.
>>
>> Just a small experiment, below are the R `summary(emails)` of the
>> sizes of my 700k emails:
>> ~~~~
>> Min. 1st Qu. Median Mean 3rd Qu. Max.
>> 8 4364 5374 11510 7042 31090000
>> ~~~~
>>
>> As seen 75% of the emails are below 7k, and this without any compression...
>>
>> Moreover we could organize the keys so that in a B-Tree structure
>> the emails in the same thread are closer together...
>
> Now I'm not sure if you talk about some berkeley-db fuse filesystem or
> direct support in notmuch.
No tricks. :)
I proposed (or, better said, asked whether it would be possible) to
have an internal interface (SPI) that any mail store would have to
implement in order to be indexed and used by notmuch. I guess the
interface would be quite lightweight, and would need just the
following operations:
* open the store;
* create a cursor iterating through all the emails, yielding only the keys;
* read the envelope (as a byte blob) of a particular key (used
  only for displaying thread lists, etc.);
* read the body (as a byte blob) of a particular key;
* maybe create a cursor iterating over all those emails that have
  changed since a particular timestamp.
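
To make the list above concrete, here is a minimal sketch of what such an SPI might look like. It is written in Python purely for illustration (notmuch itself is C), and every name in it (`MailStore`, `MemoryStore`, `read_envelope`, `changed_since`, and so on) is hypothetical; nothing like this exists in notmuch today. The in-memory store only stands in for a real backend such as BerkeleyDB.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterator, Tuple


class MailStore(ABC):
    """Hypothetical SPI that a mail store would implement (not a notmuch API)."""

    @classmethod
    @abstractmethod
    def open(cls, location: str) -> "MailStore":
        """Open the store found at `location`."""

    @abstractmethod
    def keys(self) -> Iterator[bytes]:
        """Iterate over all emails, yielding only their keys."""

    @abstractmethod
    def read_envelope(self, key: bytes) -> bytes:
        """Return the envelope (headers) of one email as a byte blob."""

    @abstractmethod
    def read_body(self, key: bytes) -> bytes:
        """Return the body of one email as a byte blob."""

    @abstractmethod
    def changed_since(self, timestamp: float) -> Iterator[bytes]:
        """Iterate over the keys of emails changed after `timestamp`."""


class MemoryStore(MailStore):
    """Toy in-memory backend, used here only to exercise the interface."""

    def __init__(self) -> None:
        # key -> (envelope, body, modification time)
        self._messages: Dict[bytes, Tuple[bytes, bytes, float]] = {}

    @classmethod
    def open(cls, location: str) -> "MemoryStore":
        # A real backend would open files or databases at `location`.
        return cls()

    def put(self, key: bytes, envelope: bytes, body: bytes, mtime: float) -> None:
        self._messages[key] = (envelope, body, mtime)

    def keys(self) -> Iterator[bytes]:
        # Sorted, to mimic the ordered traversal a B-Tree cursor would give.
        return iter(sorted(self._messages))

    def read_envelope(self, key: bytes) -> bytes:
        return self._messages[key][0]

    def read_body(self, key: bytes) -> bytes:
        return self._messages[key][1]

    def changed_since(self, timestamp: float) -> Iterator[bytes]:
        return (k for k in sorted(self._messages)
                if self._messages[k][2] > timestamp)
```

An indexer would then only ever call `keys()`, `read_envelope()` and `read_body()`, without knowing whether the emails live in a maildir, a Zip archive or a database.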
> I don't have enough cycles to modify notmuch,
> so I started to look at simpler (codewise) solution ...
>
> To summarize, what I personally want from the mail storage
We need to make a distinction between current storage (like
maildir) and archival storage (like the Zip or my proposal).
> - ability to read and write mails
It could be done through a small CLI over the proposed API.
> - should work with mutt (or mutt-kz)
This would eliminate any proposal not involving a FUSE wrapper...
> - simple backup to windows drive (files can't contain double colon ':')
This could be done via a dump-like facility. (BerkeleyDB supports
this natively through its db_dump utility.)
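
As a rough sketch of the dump idea, assuming the store can hand out (key, raw message) pairs: each email is written to its own file, and the file name is the hex encoding of the key, so it can never contain a ':' regardless of what the key looks like. The function name `dump_messages` and the `.eml` suffix are made up for this example.

```python
import os
from typing import Iterable, List, Tuple


def dump_messages(messages: Iterable[Tuple[bytes, bytes]],
                  target_dir: str) -> List[str]:
    """Write each (key, raw_message) pair to its own file in target_dir.

    File names are the hex encoding of the key, so they contain only
    the characters 0-9 and a-f plus the suffix, which is safe on Windows.
    Returns the list of file names written.
    """
    os.makedirs(target_dir, exist_ok=True)
    written = []
    for key, raw in messages:
        name = key.hex() + ".eml"
        with open(os.path.join(target_dir, name), "wb") as handle:
            handle.write(raw)
        written.append(name)
    return written
```

The original key can be recovered from a file name with `bytes.fromhex()`, so the dump could be loaded back into a store without losing anything.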