RFC: adding larger test corpus, switching to xz

Sun Apr 9 10:45:02 PDT 2017

I currently have some WIP code that passes all tests with our default
corpus, but fails with the smallest performance corpus. The simplest
thing to do would be to add a small sample from our performance corpus
to our standard (correctness) suite. I'm currently looking at
146 LKML messages. Unpacked, these are about 1.3M; they bloat the source
tarball by about 285K, which is large in relative terms (about 40%), but
small in absolute terms for most modern systems. If we switch to xz
compression, the resulting tarball is only 711K.
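
For anyone who wants to check this kind of gzip vs. xz comparison on
their own tree, here is a rough sketch in Python. It is not part of the
notmuch build, and the function names (tarball_size, compare) are made
up for illustration; it only uses the standard tarfile module to build
the same archive in memory with both compressors and report the sizes.

```python
import io
import os
import tarfile


def tarball_size(paths, mode):
    """Return the size in bytes of an in-memory tarball of `paths`.

    `mode` is a tarfile write mode such as "w:gz" or "w:xz".
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode=mode) as tar:
        for path in paths:
            # Directories are added recursively by tarfile.
            tar.add(path, arcname=os.path.basename(path))
    return len(buf.getvalue())


def compare(paths):
    """Return (gzip_bytes, xz_bytes) for the same set of paths."""
    return tarball_size(paths, "w:gz"), tarball_size(paths, "w:xz")
```

Calling compare() with the source directory, once without and once with
the extra messages, should give numbers comparable to the ones above,
though the exact sizes will depend on compression levels and on what a
real release tarball includes.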

So, I'd like comments on two questions:

1) Is a larger test corpus worth the increase in our source tarball
   size?

2) Should we (independently) switch to xz compression for our tarballs?

I'm not very enthusiastic about complicating the test system with yet
another kind of artifact to be downloaded. If we want to minimize the
size of the test corpora, I'd probably just extract the troublesome
thread, which would work for testing my current bug but would be less
useful for finding future bugs.

d