[PATCH 3/4] lib: handle DatabaseModifiedError in _n_message_ensure_metadata

Sat Feb 25 12:39:41 PST 2017

On Sat, 25 Feb 2017, David Bremner <david at tethera.net> wrote:
> Jani Nikula <jani at nikula.org> writes:
>
>> On Fri, 24 Feb 2017, David Bremner <david at tethera.net> wrote:
>>> The retries are hardcoded to a small number, and error handling aborts
>>> than propagating errors from notmuch_database_reopen. These are both
>>> somewhat justified by the assumption that most things that can go
>>> wrong in Xapian::Database::reopen are rare and fatal (like running out
>>> of memory or disk corruption).
>>
>> I think the sanity of the implementation hinges on that assumption. It
>> makes sense if you're right, but I really have no idea either way...
>
> That was my conclusion from talking to Olly (xapian upstream).
>
> 24-02-2017 08:12:57 < bremner> any intuition about how likely
>    Xapian::Database::reopen is to fail? I'm catching a
>    DatabaseModifiedError somewhere where handling any further errors is
>    tricky, and wondering about treating a failed reopen as as "the
>    impossible happened, stopping"
>
> 24-02-2017 16:22:34 < olly> bremner: there should not be much scope for
>  failure - stuff like out of memory or disk errors, which are probably a
>  good enough excuse to stop
>
> I could add that to the commit message?

Yes, please, it'll come in handy when the memory of the discussion has
faded! (Like two weeks from now... ;)

>
>>> +
>>> +	    /* all the way without an exception */
>>> +	    success = TRUE;
>>
>> Nitpick, if you don't intend to use that variable to return status from
>> the function, you can just break here, and get rid of the variable. But
>> no big deal.
>>
>
> I think I have some kind of mental block about break and continue. But
> it could even be a goto, those I understand ;).

Heh, as I said, no big deal.

>
>>
>>> +	} catch (const Xapian::DatabaseModifiedError &error) {
>>> +	    notmuch_status_t status = notmuch_database_reopen (message->notmuch);
>>> +	    if (status != NOTMUCH_STATUS_SUCCESS)
>>> +		INTERNAL_ERROR ("unhandled error from notmuch_database_reopen: %s\n",
>>> +				notmuch_status_to_string (status));
>>> +	    success = FALSE;
>>> +	} catch (const Xapian::Error &error) {
>>> +	    INTERNAL_ERROR ("A Xapian exception occurred fetching message metadata: %s\n",
>>> +			    error.get_msg().c_str());
>>> +	}
>>
>> If the assumption is that these really are rare cases (read: shouldn't
>> happen), INTERNAL_ERROR is an improvement over leaking the
>> exception. Otherwise, I think we'd need to propagate the status all the
>> way to the API, which would really be annoying.
>>
>> I guess I think this is a worthwhile improvement no matter what.
>
> Yeah, I had a go at that in the previous longer series, but I was not
> very happy with the (incomplete) results

Err, I'm sorry for not being clear, I meant that this patch *series* is
a worthwhile improvement as-is, even if reopen did have common and
unrecoverable failure modes (which it shouldn't). This series catches a
previously uncaught exception, tries to do something sensible about it,
and failing at that, causes an INTERNAL_ERROR. That's a huge
improvement. If we end up seeing those errors later, we can reconsider
the error propagation.

BR,
Jani.