[RFC] Split notmuch_database_close into two functions

Justus Winter 4winter at informatik.uni-hamburg.de
Thu Apr 12 10:19:39 PDT 2012


Quoting Austin Clements (2012-04-12 18:57:44)
>Quoth Justus Winter on Apr 12 at 11:05 am:
>> Quoting Austin Clements (2012-04-01 05:23:23)
>> >Quoth Justus Winter on Mar 21 at  1:55 am:
>> >> I propose to split the function notmuch_database_close into
>> >> notmuch_database_close and notmuch_database_destroy so that long
>> >> running processes like alot can close the database while still using
>> >> data obtained from queries to that database.
>> >
>> >Is this actually safe?  My understanding of Xapian::Database::close is
>> >that, once you've closed the database, basically anything can throw a
>> >Xapian exception.  A lot of data is retrieved lazily, both by notmuch
>> >and by Xapian, so simply having, say, a notmuch_message_t object isn't
>> >enough to guarantee that you'll be able to get data out of it after
>> >closing the database.  Hence, I don't see how this interface could be
>> >used correctly.
>> 
>> I do not know how, but both alot and afew (and occasionally the
>> notmuch binary) are somehow safely using this interface on my box for
>> the last three weeks.
>
>I see.  TL;DR: This isn't safe, but that's okay if we document it.
>
>The bug report [0] you pointed to was quite informative.  At its core,
>this is really a memory management issue.  To sum up for the record
>(and to check my own thinking): It sounds like alot is careful not to
>use any notmuch objects after closing the database.  The problem is
>that, currently, closing the database also talloc_free's it, which
>recursively free's everything derived from it.  Python later GCs the
>wrapper objects, which *also* try to free their underlying objects,
>resulting in a double free.
>
>Before the change to expose notmuch_database_close, the Python
>bindings would only talloc_free from destructors.  Furthermore, they
>prevented the library from recursively freeing things at other times
>by internally maintaining a reverse reference for every library talloc
>reference (e.g., message is a sub-allocation of query, so the bindings
>keep a reference from each message to its query to ensure the query
>doesn't get freed).  The ability to explicitly talloc_free the
>database subverts this mechanism.

Exactly.

>So, I've come around to thinking that splitting notmuch_database_close
>and _destroy is okay.  It certainly parallels the rest of the API
>better.  However, notmuch_database_close needs a big warning similar
>to Xapian::Database::close's warning that retrieving information from
>objects derived from this database may not work after calling close.

Yes, but then again one should always expect function calls to fail
and most APIs have mechanisms to communicate failures.

OTOH this might be an indication that the notmuch API should be
redesigned. Both alot and afew have their own wrappers around the
notmuch API to work around some limitations (e.g. changes to messages
are enqueued and executed at some point, with some kind of mechanism
to cope with the notmuch database temporarily not being available,
message objects have to be re-fetched if they got outdated (IIRC,
whatever that means)).

Justus


More information about the notmuch mailing list