notmuch-lib questions and observations

Tomi Valkeinen tomi.valkeinen at iki.fi
Mon Nov 18 06:40:52 PST 2013


Hi,

I found out about notmuch quite recently, and now I've been tinkering
with it, prototyping a GUI client. I have some questions and observations:

1.

The API seems to be a bit broken. I think many of the functions should
return notmuch_status_t. I encountered this issue with get_header() and
get_date(), which I happened to call after the DB had been changed
twice, leading to Xapian::DatabaseModifiedError.

Neither function handles the exception, causing a crash, which is
obviously a bug, but even if they did handle the exception, they have no
way to return sensible error information. Even worse, consider
count_messages(), for which a return value of 0 is valid, so the caller
cannot tell an empty result from a failure.

So, as far as I see, many of the funcs should be changed to something like:

notmuch_status_t
notmuch_query_count_messages (notmuch_query_t *query, unsigned *count);
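
To illustrate the shape of such an interface, here is a minimal,
self-contained sketch. The demo_* names, the status codes and the
db_modified flag are all hypothetical stand-ins, not notmuch's actual
API; the flag merely simulates a reader that has hit
Xapian::DatabaseModifiedError.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes; stand-ins for notmuch_status_t values. */
typedef enum {
    DEMO_STATUS_SUCCESS = 0,
    DEMO_STATUS_DB_MODIFIED   /* e.g. a caught Xapian::DatabaseModifiedError */
} demo_status_t;

/* Hypothetical query: a fixed match count plus a flag that simulates
 * a reader whose database revision has gone stale. */
typedef struct {
    unsigned matches;
    int db_modified;
} demo_query_t;

/* Report failure through the return value and the count through an
 * out-parameter, so a count of 0 cannot be confused with an error. */
static demo_status_t
demo_query_count_messages (const demo_query_t *query, unsigned *count)
{
    assert (query != NULL && count != NULL);

    if (query->db_modified)
        return DEMO_STATUS_DB_MODIFIED;   /* *count is left untouched */

    *count = query->matches;
    return DEMO_STATUS_SUCCESS;
}
```

With this shape, a caller checks the status first and only then trusts
the count, so "no matches" and "the query failed" become distinguishable.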


2.

This is more about Xapian, I guess. The behavior that a db reader will
start failing if the db has been changed twice is rather bad. If I'm not
mistaken, having a rather long read-only query could be blocked (or,
well, re-tried) forever, if there just happens to be a few db writes
during the read.

I think a better approach would be to allow only one change to the db if
there are open db readers. If a second db writer tries to open the db,
it would get a failure (instead of the readers).

Anyone know if this has been discussed, or if my suggestion is plain silly?

3.

How is a client using notmuch supposed to find out there are new
messages, and which messages are new?

My current thought is to make 'notmuch new' run a script that tags the
messages, and make it add a 'new-gui' or such tag to all new messages.
The client would then periodically make a query for that tag, and at the
same time remove the tag for any returned messages.
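
As a rough sketch of one such polling pass, here is a self-contained
model that keeps messages in memory instead of talking to notmuch. The
demo_* types and functions are hypothetical and only show the logic
"find everything carrying the tag, remember its id, strip the tag"; a
real client would do the same with a notmuch query and tag removal.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_TAGS 8

/* Hypothetical in-memory message: an id plus a small set of tags. */
typedef struct {
    const char *id;
    const char *tags[DEMO_MAX_TAGS];
    size_t ntags;
} demo_message_t;

/* Remove `tag` from `msg`; return 1 if it was present, 0 otherwise. */
static int
demo_message_remove_tag (demo_message_t *msg, const char *tag)
{
    for (size_t i = 0; i < msg->ntags; i++) {
        if (strcmp (msg->tags[i], tag) == 0) {
            /* Fill the freed slot with the last tag (order is not kept). */
            msg->tags[i] = msg->tags[--msg->ntags];
            return 1;
        }
    }
    return 0;
}

/* One polling pass: collect the ids of messages carrying `tag` and
 * strip the tag from them. Returns the number of ids written to `out`
 * (at most `max_out`); messages beyond that keep their tag for the
 * next pass. */
static size_t
demo_poll_new (demo_message_t *msgs, size_t nmsgs, const char *tag,
               const char **out, size_t max_out)
{
    size_t found = 0;

    assert (msgs != NULL && tag != NULL && out != NULL);

    for (size_t i = 0; i < nmsgs && found < max_out; i++) {
        if (demo_message_remove_tag (&msgs[i], tag))
            out[found++] = msgs[i].id;
    }
    return found;
}
```

Because the tag is removed in the same pass that reports the message, a
second pass over unchanged data reports nothing, which is the behavior
the periodic query relies on.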

4.

Has there been discussion on returning integer IDs instead of strings
from various functions like notmuch_message_get_message_id() and
notmuch_tags_get()?

I have two things behind this question:

- Marshaling strings from native code to managed code requires
allocating memory and copying the string, whereas returning an int is
more or less a no-op [1][2]. E.g. at the moment if I fetch tag 'inbox'
for 10k messages, I'm creating a new 'inbox' string 10k times. I'd
rather fetch an int 10k times, and the 'inbox' string once.

- My prototype fetches the message ids for all the messages returned by
the query, so that it can later load the message if the user wants to
read it. Fetching and storing only an int per message versus a long-ish
string per message would most likely be good for performance with large
queries.
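
The idea behind both points is essentially string interning. A minimal
sketch, assuming a small fixed-size table and a linear search (both
chosen only to keep the example short), could look like this. The
demo_* names are hypothetical and nothing like this exists in the
notmuch API as far as I know.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_INTERNED 64

/* Hypothetical intern table: each distinct string is stored once and
 * referred to by its index. The table does not copy the strings, so
 * the caller must keep them alive. */
typedef struct {
    const char *strings[DEMO_MAX_INTERNED];
    int count;
} demo_intern_t;

/* Return the id for `s`, adding it to the table if it is not there yet.
 * Returns -1 if the table is full. */
static int
demo_intern (demo_intern_t *table, const char *s)
{
    assert (table != NULL && s != NULL);

    for (int i = 0; i < table->count; i++) {
        if (strcmp (table->strings[i], s) == 0)
            return i;
    }
    if (table->count == DEMO_MAX_INTERNED)
        return -1;

    table->strings[table->count] = s;
    return table->count++;
}

/* Map an id back to its string, or NULL if the id is unknown. */
static const char *
demo_intern_lookup (const demo_intern_t *table, int id)
{
    if (id < 0 || id >= table->count)
        return NULL;
    return table->strings[id];
}
```

With something like this on the native side, the managed code would
receive the same small int for 'inbox' on every message and resolve it
to a string only once.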

5.

This one is just a vague thought that came to my mind. At the moment
notmuch hides Xapian totally behind notmuch's interface, which probably
makes things simpler (and gives a nice C API), but also (afaik) prevents
using Xapian features that are not at the moment supported in the
notmuch API.

I wonder how an approach would work where notmuch would be a bit more
like a helper library, allowing full use of Xapian's features but making
it simple to manage the notmuch database. So, for example, when making a
query, you'd create a Xapian query with notmuch, and then use Xapian to
run the query.

I don't have anything clear in mind, and obviously Xapian being C++
might make the whole idea unimplementable.

 Tomi


[1] That's in C#. I wouldn't be surprised if it's also the same with
other higher-level languages.

[2] That's not entirely true, as strings can be passed as is, if the
managed code is given the ownership of the string, and the managed code
will free the string eventually.
