Bug#842291: notmuch processes frequently stuck in select()
Daniel Kahn Gillmor
dkg at fifthhorseman.net
Wed Nov 23 09:19:05 PST 2016
Control: affects 842291 + gpgsm dirmngr
On Wed 2016-11-23 03:50:40 -0500, David Bremner wrote:
> David Bremner <david at tethera.net> writes:
>
>> Brian May <bam at debian.org> writes:
>>> strace shows notmuch looping in select.
>>>
>>> select(10, [9], [], NULL, {1, 0}) = 0 (Timeout)
>>> select(10, [9], [], NULL, {1, 0}) = 0 (Timeout)
>>> select(10, [9], [], NULL, {1, 0}) = 0 (Timeout)
>>> select(10, [9], [], NULL, {1, 0}) = 0 (Timeout)
>>> etc
>>>
>>
>> a backtrace would be helpful.
>>
>> d
>
> Nevermind, I managed to download the list archive for debian-devel and
> replicate the bug.
>
> The bug seems to be related to smime signature verification. After
> adding the attached mail message (and running notmuch-new), to replicate
> the hang it suffices to run
>
> % notmuch show --decrypt id:20161116T143809.GA.21721.stse at fsing.rootsland.net
>
> As far as workarounds, turning off decryption / signature verification
> should allow you to at least view the thread.
I've noticed similar behavior, and it seems to correlate with gpgsm
asking dirmngr for an update to CRLs related to S/MIME certs.

Some CRL servers simply do not respond at all (resulting in a timeout),
and some do not respond, or are laggy, when accessed over tor
specifically.
I see a couple possible ways to consider resolving this, none of them
great, and i don't know exactly how to implement any of them:

 0) turn off CRL updates entirely during s/mime signature verification

 1) do s/mime signature verification without CRL updates, but schedule
    CRL checks to happen in the background for dirmngr, so that future
    verifications will reflect the cert validity

 2) have dirmngr avoid checking CRLs that it knows it has already
    updated recently

 3) tell dirmngr to use much shorter CRL fetch timeouts

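To make option 2 a bit more concrete, here is a rough Python sketch of the idea. It is purely illustrative and is not how dirmngr is implemented: the class name, the helper and the 600-second interval are all made up for the example, and treating failed attempts as "recent" too is an assumption that goes slightly beyond the option as written.

```python
import time


class RecentFetchGuard:
    """Remember when each CRL URL was last tried and skip repeat attempts.

    Hypothetical helper for illustration only; not part of dirmngr.
    """

    def __init__(self, min_interval_seconds, clock=time.monotonic):
        self.min_interval = min_interval_seconds
        self.clock = clock
        self.last_attempt = {}  # URL -> clock reading of the last attempt

    def should_fetch(self, url):
        last = self.last_attempt.get(url)
        return last is None or self.clock() - last >= self.min_interval

    def record_attempt(self, url):
        self.last_attempt[url] = self.clock()


def fetch_if_due(guard, url, fetch):
    """Call fetch(url) only when the guard allows it; return True if it ran."""
    if not guard.should_fetch(url):
        return False
    # Recorded before the call, so an attempt that fails or raises still counts.
    guard.record_attempt(url)
    fetch(url)
    return True
```

With something like this in place, a second verification against the same unreachable distribution point within the interval would skip the fetch instead of waiting for another timeout.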
Some example traffic from my dirmngr that uses tor (complete with
timestamps indicating just how bad the delays can be):
Nov 22 14:08:24 alice dirmngr[11976]: no CRL available for issuer id 770B4DA5913F2572B9F679AE0819FB7D77572689
Nov 22 14:08:24 alice dirmngr[11976]: fetching CRL from 'http://crl.ll.mit.edu/getcrl/LLCA3'
Nov 22 14:08:44 alice dirmngr[11976]: resolving 'crl.ll.mit.edu' failed: No data
Nov 22 14:08:44 alice dirmngr[11976]: can't connect to 'crl.ll.mit.edu': host not found
Nov 22 14:08:44 alice dirmngr[11976]: error retrieving 'http://crl.ll.mit.edu/getcrl/LLCA3': Unknown host
Nov 22 14:08:44 alice dirmngr[11976]: crl_fetch via DP failed: Unknown host
Nov 22 14:08:45 alice dirmngr[11976]: no CRL available for issuer id 770B4DA5913F2572B9F679AE0819FB7D77572689
Nov 22 14:08:45 alice dirmngr[11976]: fetching CRL from 'http://crl.ll.mit.edu/getcrl/LLCA3'
Nov 22 14:09:05 alice dirmngr[11976]: resolving 'crl.ll.mit.edu' failed: No data
Nov 22 14:09:05 alice dirmngr[11976]: can't connect to 'crl.ll.mit.edu': host not found
Nov 22 14:09:05 alice dirmngr[11976]: error retrieving 'http://crl.ll.mit.edu/getcrl/LLCA3': Unknown host
Nov 22 14:09:05 alice dirmngr[11976]: crl_fetch via DP failed: Unknown host
Nov 22 14:09:05 alice dirmngr[11976]: no CRL available for issuer id 26FD002905277B015EE9B2A3C092A348F28A4C6B
Nov 22 14:09:05 alice dirmngr[11976]: fetching CRL from 'http://crl.startssl.com/sca-client1.crl'
Nov 22 14:09:25 alice dirmngr[11976]: resolving 'crl.startssl.com' failed: No data
Nov 22 14:09:25 alice dirmngr[11976]: can't connect to 'crl.startssl.com': host not found
Nov 22 14:09:25 alice dirmngr[11976]: error retrieving 'http://crl.startssl.com/sca-client1.crl': Unknown host
Nov 22 14:09:25 alice dirmngr[11976]: crl_fetch via DP failed: Unknown host
Nov 22 14:09:25 alice dirmngr[11976]: no CRL available for issuer id 26FD002905277B015EE9B2A3C092A348F28A4C6B
Nov 22 14:09:25 alice dirmngr[11976]: fetching CRL from 'http://crl.startssl.com/sca-client1.crl'
Nov 22 14:09:45 alice dirmngr[11976]: resolving 'crl.startssl.com' failed: No data
Nov 22 14:09:45 alice dirmngr[11976]: can't connect to 'crl.startssl.com': host not found
Nov 22 14:09:45 alice dirmngr[11976]: error retrieving 'http://crl.startssl.com/sca-client1.crl': Unknown host
Nov 22 14:09:45 alice dirmngr[11976]: crl_fetch via DP failed: Unknown host
That's a 20-second lag for each failed fetch, adding up to an 80-second
delay in rendering a single thread where 4 messages were signed by
S/MIME certs issued by two different authorities.
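The arithmetic can be checked directly from the timestamps in the log. A small Python sketch, assuming each "fetching CRL from" line pairs with the next "crl_fetch via DP failed" line and that all entries fall on the same day (the excerpt below is trimmed to just those lines; the function name is only for illustration):

```python
from datetime import datetime

LOG = """\
Nov 22 14:08:24 alice dirmngr[11976]: fetching CRL from 'http://crl.ll.mit.edu/getcrl/LLCA3'
Nov 22 14:08:44 alice dirmngr[11976]: crl_fetch via DP failed: Unknown host
Nov 22 14:08:45 alice dirmngr[11976]: fetching CRL from 'http://crl.ll.mit.edu/getcrl/LLCA3'
Nov 22 14:09:05 alice dirmngr[11976]: crl_fetch via DP failed: Unknown host
Nov 22 14:09:05 alice dirmngr[11976]: fetching CRL from 'http://crl.startssl.com/sca-client1.crl'
Nov 22 14:09:25 alice dirmngr[11976]: crl_fetch via DP failed: Unknown host
Nov 22 14:09:25 alice dirmngr[11976]: fetching CRL from 'http://crl.startssl.com/sca-client1.crl'
Nov 22 14:09:45 alice dirmngr[11976]: crl_fetch via DP failed: Unknown host
"""


def fetch_delays(log_text):
    """Return seconds elapsed between each fetch start and its failure."""
    delays = []
    started = None
    for line in log_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        # Third field is the HH:MM:SS time; the date is ignored (same-day assumption).
        stamp = datetime.strptime(fields[2], "%H:%M:%S")
        if "fetching CRL from" in line:
            started = stamp
        elif "crl_fetch via DP failed" in line and started is not None:
            delays.append(int((stamp - started).total_seconds()))
            started = None
    return delays


delays = fetch_delays(LOG)
print(delays, sum(delays))
```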
Fwiw, crl.ll.mit.edu doesn't seem to respond over tor on port 80 at all
in some cases, and in other cases takes nearly a minute to reply:
0 dkg at alice:/tmp/cdtemp.Ue45bu$ time wget -q 'http://crl.ll.mit.edu/getcrl/LLCA3'
real 0m0.694s
user 0m0.008s
sys 0m0.008s
0 dkg at alice:/tmp/cdtemp.Ue45bu$ time torsocks wget -q 'http://crl.ll.mit.edu/getcrl/LLCA3'
real 0m58.828s
user 0m0.008s
sys 0m0.008s
0 dkg at alice:/tmp/cdtemp.Ue45bu$
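Until one of the options above is implemented, a caller that shells out to notmuch could at least bound its own wait. A minimal Python sketch, assuming the caller is a script using the standard subprocess module; the function name and the 10-second figure are made up, and this does nothing about dirmngr itself, it only stops the caller from hanging:

```python
import subprocess


def run_with_deadline(command, seconds):
    """Run command; return its stdout, or None if it exceeds the deadline."""
    try:
        result = subprocess.run(
            command, capture_output=True, text=True, timeout=seconds
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing is left running.
        return None
    return result.stdout


# Hypothetical usage (MESSAGE_ID is a placeholder, not a real message id):
#   output = run_with_deadline(
#       ["notmuch", "show", "--decrypt", "id:MESSAGE_ID"], 10)
#   if output is None:
#       print("notmuch show timed out")
```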
Any thoughts on the best way to pursue this?
--dkg