Bug#683505: notmuch: FTBFS if built twice in a row: unrepresentable changes to source
Tomi Ollila
tomi.ollila at iki.fi
Thu Aug 2 14:12:28 PDT 2012
On Thu, Aug 02 2012, Austin Clements <amdragon at MIT.EDU> wrote:
> Quoth Jameson Graef Rollins on Aug 01 at 8:10 pm:
>> On Wed, Aug 01 2012, David Bremner <david at tethera.net> wrote:
>> > As I mentioned on IRC, the test only fails on the Debian build machines
>> > (building in a clean chroot using sbuild is not enough) so it isn't
>> > really clear how to duplicate it. Perhaps building in a clean
>> > virtual machine without networking would do it. For which tests fail,
>> > see
>> >
>> > https://buildd.debian.org/status/fetch.php?pkg=notmuch&arch=i386&ver=0.13.2-1&stamp=1338740444
>> >
>> > I think the first things to fail are emacs tests. At a wild guess, it
>> > looks like all of the failing tests are related to emacs.
>>
>> From a cursory look that does appear to be the case. The non-emacs
>> tests that are also failing (json and crypto) are using
>> emacs_deliver_message. Do we have any idea what's going on here?
>
> There is one other illuminating tidbit in the buildd log:
>
> emacs-subject-to-filename: Testing emacs: mail subject to filename
> test-lib.sh: line 187: 30606 Terminated sleep 1
> FATAL: Unexpected exit with code 1
>
> From a cursory glance, emacs-subject-to-filename appears to be the
> only test that calls test_emacs outside of a subtest and hence without
> stdout/stderr redirection.
>
> The line number is useless, but, assuming valgrind isn't enabled,
> there's only one place we sleep 1 in test-lib.sh: in the loop in
> test_emacs that waits for the Emacs server to start up. Furthermore,
> timeout sends SIGTERM by default, suggesting that we're timing out
> while we're spinning in that loop.

The situation sounds strangely familiar. I remember seeing 'sleep 1'
processes with ascending pids in the process list around the same time
I had this (notmuch-test-wait) problem. I think the system was lacking
the server socket in the /tmp/emacs-<pid>/ directory.

Hmm, now I remember something: some error happened during emacs
startup, so (server-start) was never executed, and the test_emacs '()'
call in the loop could never connect to the socket.
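
To illustrate the failure mode, here is a minimal sketch of such a wait
loop with an upper bound on retries. It is not the code from test-lib.sh;
the function name wait_for_server and the retry limit are made up for
this example. The point is only that an unbounded version of this loop
spins until the outer timeout kills it if emacs never runs (server-start).

```shell
# Sketch only: a bounded variant of the wait loop, not the test-lib.sh code.
# Usage: wait_for_server MAX_TRIES PROBE_COMMAND...
wait_for_server () {
    tries_left=$1
    shift
    # Run the probe command until it succeeds, sleeping 1s between tries.
    until "$@" >/dev/null 2>&1
    do
        tries_left=$((tries_left - 1))
        if [ "$tries_left" -le 0 ]; then
            echo "server did not come up; check emacs startup for errors" >&2
            return 1
        fi
        sleep 1
    done
}

# Hypothetical usage; "$EMACS_SERVER" stands for whatever server name
# the test suite actually uses:
# wait_for_server 30 emacsclient --socket-name="$EMACS_SERVER" --eval '()'
```

With a bound like this the test would fail with a message instead of
being terminated by timeout in the middle of 'sleep 1'.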

In the buildd log quoted above it seems that the first test,
test_emacs '(notmuch-hello) (test-output)', couldn't be executed.
As there is no test/emacs.el file, "$load_emacs_tests" is empty
(instead of --eval '(load "$TEST_DIRECTORY/emacs.el")'), so that
cannot be what breaks it.

Unfortunately I did not investigate that further (or it was my own
mistake that made emacs fail). But if it happens again and someone
is monitoring the progress (maybe using a larger timeout value than
'2m'), the failing emacs can be attached to with 'dtach -a $socket'.
The $socket can be found by running 'ps aww | grep dtach'.
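
A rough sketch of how the socket lookup could be scripted. It assumes
dtach was started as "dtach -n SOCKET command...", so that the socket
path is the word following "-n" in the ps output; that invocation form
and the helper name dtach_sockets_from_ps are assumptions of this
example, and socket paths containing spaces are not handled.

```shell
# Sketch only: reads 'ps aww' style lines on stdin and prints the word
# that follows "dtach -n" on each line (assumed to be the socket path).
dtach_sockets_from_ps () {
    awk '{
        for (i = 1; i + 2 <= NF; i++)
            if ($i ~ /(^|\/)dtach$/ && $(i + 1) == "-n")
                print $(i + 2)
    }'
}

# Hypothetical usage: attach to the first socket found.
# socket=$(ps aww | dtach_sockets_from_ps | head -n 1)
# dtach -a "$socket"
```

The 'grep dtach' process itself is not matched, since the filter looks
for a dtach command word followed by "-n".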
Tomi