[PATCH] lib: add 'body:' field, stop indexing headers twice.
David Bremner
david at tethera.net
Fri Mar 29 06:17:35 PDT 2019
David Bremner <david at tethera.net> writes:
> This follows a suggestion of Olly Betts to use the facility (since
> Xapian 1.0.4) to add the same field with multiple prefixes. The double
> indexing of previous versions is thus replaced with a query-time
> expansion of unprefixed query terms to the various prefixed
> equivalents.
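To make the idea of query-time expansion concrete, here is a minimal model in plain Python. It does not call Xapian at all; the prefix table and the expand_term helper are hypothetical names for illustration only, and the real prefixes are whatever the notmuch source defines.

```python
# Hypothetical field -> term-prefix table; the real values live in the
# notmuch source and may differ.
FIELD_PREFIXES = {
    "from": "XFROM",
    "to": "XTO",
    "subject": "XSUBJECT",
    "body": "XBODY",
}


def expand_term(term):
    """Return the prefixed index terms that a single query term may match.

    A term written as field:word maps to one prefixed term. An unprefixed
    term expands to every prefixed equivalent; a query parser would
    combine those alternatives with OR.
    """
    field, sep, word = term.partition(":")
    if sep and field in FIELD_PREFIXES:
        return [FIELD_PREFIXES[field] + word.lower()]
    return [prefix + term.lower() for prefix in FIELD_PREFIXES.values()]


print(expand_term("body:hello"))
print(expand_term("hello"))
```

In this model each word is stored once per field it occurs in, and the cost of matching an unprefixed word moves to query time, which is the trade the patch description makes.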
This patch leads to approximately a 10% decrease in database size on our
performance suite (2.1G -> 1.9G) before compaction. After compaction, the
old -> new sizes are 1.4G -> 1.3G.
With the caveat that the benchmark machine was not completely idle, it
also leads to a roughly 10% speedup.
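For anyone who wants to check the size figures, a small sketch of the arithmetic follows. The sizes are the rounded values quoted above, so the percentages are only as precise as those; the helper name is mine.

```python
def percent_decrease(old, new):
    """Relative decrease from old to new, as a percentage of old."""
    return (old - new) / old * 100.0


# Database sizes in GB (old, new), as quoted above.
SIZES_GB = {
    "before compaction": (2.1, 1.9),
    "after compaction": (1.4, 1.3),
}

for label, (old, new) in SIZES_GB.items():
    print(f"{label}: {percent_decrease(old, new):.1f}% smaller")
```

With the rounded sizes this gives about 9.5% before compaction and about 7.1% after.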
Existing indexing:
T00-new.sh: Testing notmuch new [0.4 large]
                       Wall(s)  Usr(s)  Sys(s)    Res(K)  In/Out(512B)
Initial notmuch new     565.17  534.82   28.22    474632  0/13854576
notmuch new #2            0.03    0.00    0.00      9512  0/160
notmuch new #3            0.00    0.00    0.00      9368  0/8
notmuch new #4            0.00    0.00    0.00      9412  0/8
notmuch new #5            0.00    0.00    0.00      9384  0/8
notmuch new #6            0.00    0.00    0.00      9388  0/8

T01-dump-restore.sh: Testing dump and restore [0.4 large]
                       Wall(s)  Usr(s)  Sys(s)    Res(K)  In/Out(512B)
load nmbug tags          16.25    2.65    3.05     12668  104/40104
dump *                    3.90    3.79    0.10     26048  0/27928
restore *                 4.51    4.10    0.41      9564  0/0

T02-tag.sh: Testing tagging [0.4 large]
                       Wall(s)  Usr(s)  Sys(s)    Res(K)  In/Out(512B)
tag * +new_tag          374.69  197.56  169.55    118644  0/1818656
tag * +existing_tag       0.00    0.00    0.00      9232  0/0
tag * -existing_tag     318.47  151.46  164.56     36260  0/1819584
tag * -missing_tag        0.00    0.00    0.00      9336  0/0

T03-reindex.sh: Testing tagging [0.4 large]
                       Wall(s)  Usr(s)  Sys(s)    Res(K)  In/Out(512B)
reindex *               688.27  488.02  197.59  11142680  0/4908120
reindex *               648.04  456.06  191.78  11139124  0/2696120
reindex *               650.70  459.08  191.48  11139088  0/2696680

T04-thread-subquery.sh: Testing thread subqueries [0.4 large]
                       Wall(s)  Usr(s)  Sys(s)    Res(K)  In/Out(512B)
search thread:{} ...      2.45    2.29    0.15     94696  0/144
search thread:{} ...      2.43    2.23    0.20     94228  0/144
search thread:{} ...      2.46    2.26    0.20     94224  0/144

With new indexing:
T00-new.sh: Testing notmuch new [0.4 large]
                       Wall(s)  Usr(s)  Sys(s)    Res(K)  In/Out(512B)
Initial notmuch new     494.31  466.96   24.28    447428  0/12093344
notmuch new #2            0.03    0.00    0.00      9356  0/144
notmuch new #3            0.01    0.01    0.00      9420  0/8
notmuch new #4            0.00    0.00    0.00      9388  0/8
notmuch new #5            0.00    0.00    0.00      9416  0/8
notmuch new #6            0.01    0.00    0.01      9424  0/8

T01-dump-restore.sh: Testing dump and restore [0.4 large]
                       Wall(s)  Usr(s)  Sys(s)    Res(K)  In/Out(512B)
load nmbug tags          14.21    2.41    2.71     12664  0/38952
dump *                    3.70    3.57    0.12     26092  0/27928
restore *                 4.19    3.78    0.41      9412  0/0

T02-tag.sh: Testing tagging [0.4 large]
                       Wall(s)  Usr(s)  Sys(s)    Res(K)  In/Out(512B)
tag * +new_tag          353.31  183.89  161.49    111244  0/1693872
tag * +existing_tag       0.00    0.00    0.00      9316  0/0
tag * -existing_tag     284.07  137.15  144.33     36712  0/1659200
tag * -missing_tag        0.00    0.00    0.00      9240  0/0

T03-reindex.sh: Testing tagging [0.4 large]
                       Wall(s)  Usr(s)  Sys(s)    Res(K)  In/Out(512B)
reindex *               640.19  431.23  196.99  10214564  1510/4504024
reindex *               611.46  412.37  193.07  10211852  1056/2557688
reindex *               612.95  415.40  194.97  10211848  0/2555032

T04-thread-subquery.sh: Testing thread subqueries [0.4 large]
                       Wall(s)  Usr(s)  Sys(s)    Res(K)  In/Out(512B)
search thread:{} ...      2.34    2.12    0.21     96452  0/144
search thread:{} ...      2.35    2.17    0.18     96208  0/144
search thread:{} ...      2.33    2.08    0.25     94740  0/144
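
As a rough way to read the two sets of timings side by side, the sketch below computes the relative wall-time change for the longer-running tests. The numbers are copied from the Wall(s) columns above; the dictionary and function names are only illustrative, and a single run on a machine that was not idle should not be over-interpreted.

```python
# Wall-clock seconds as (existing indexing, new indexing), copied from
# the Wall(s) columns of the benchmark output above.
WALL_SECONDS = {
    "Initial notmuch new": (565.17, 494.31),
    "tag * +new_tag": (374.69, 353.31),
    "tag * -existing_tag": (318.47, 284.07),
    "reindex * (first run)": (688.27, 640.19),
}


def wall_time_reduction(old, new):
    """Percentage of wall-clock time saved going from old to new."""
    return (old - new) / old * 100.0


for name, (old, new) in WALL_SECONDS.items():
    saved = wall_time_reduction(old, new)
    print(f"{name:24s} {old:8.2f} -> {new:8.2f}  ({saved:.1f}% less)")
```

By this measure the initial notmuch new run takes about 12.5% less wall time, and the tagging and reindex runs fall between roughly 6% and 11%.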