[PATCH] lib: add 'body:' field, stop indexing headers twice.

David Bremner david at tethera.net
Fri Mar 29 06:17:35 PDT 2019


David Bremner <david at tethera.net> writes:

> This follows a suggestion of Olly Betts to use the facility (since
> Xapian 1.0.4) to add the same field with multiple prefixes. The double
> indexing of previous versions is thus replaced with a query time
> expansion of unprefixed query terms to the various prefixed
> equivalents.
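
To make the idea in the quoted paragraph concrete, here is a toy model
of the two schemes in plain Python. It is only a sketch: it does not use
the Xapian API, the prefixes ("S:", "F:", "B:") and all function names
are made up for illustration, and it assumes the old scheme stored
header terms both prefixed and unprefixed while body terms were stored
unprefixed only.

```python
from collections import defaultdict

# Hypothetical field prefixes, chosen for illustration only.
PREFIXES = {"subject": "S:", "from": "F:", "body": "B:"}


def terms(text):
    """Split text into lowercase whitespace-separated terms."""
    return text.lower().split()


def index_old(doc_id, fields, index):
    """Double indexing: header terms are stored prefixed and unprefixed;
    body terms are stored unprefixed only."""
    for field, text in fields.items():
        for term in terms(text):
            if field != "body":
                index[PREFIXES[field] + term].add(doc_id)
            index[term].add(doc_id)


def index_new(doc_id, fields, index):
    """Single indexing: every term is stored once, under its field prefix."""
    for field, text in fields.items():
        for term in terms(text):
            index[PREFIXES[field] + term].add(doc_id)


def search_unprefixed(term, index):
    """Query-time expansion: OR together the term under every prefix."""
    hits = set()
    for prefix in PREFIXES.values():
        hits |= index.get(prefix + term, set())
    return hits


def posting_count(index):
    """Total number of (term, document) entries in the index."""
    return sum(len(docs) for docs in index.values())


message = {"subject": "hello world", "from": "alice", "body": "hello there"}
old, new = defaultdict(set), defaultdict(set)
index_old(1, message, old)
index_new(1, message, new)
```

In this toy, an unprefixed search against the new index finds the same
message as a direct lookup in the old one, while the new index holds
fewer entries because header terms are no longer stored twice.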

This patch leads to approximately a 10% decrease in database size on
our performance suite (2.1G -> 1.9G) before compaction.  After
compaction, the sizes are 1.4G (old) and 1.3G (new).
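
For reference, the relative size reduction can be worked out from the
figures above. This is a small sketch; the bounds assume (my assumption,
not stated above) that each size was rounded to the nearest 0.1G, which
makes the percentages quite approximate.

```python
def percent_decrease(old, new):
    """Relative decrease from old to new, in percent."""
    return (old - new) / old * 100.0


def decrease_bounds(old, new, step=0.1):
    """Smallest and largest decrease consistent with sizes rounded to
    the nearest `step` (an assumption about how the sizes were reported)."""
    half = step / 2
    lowest = percent_decrease(old - half, new + half)
    highest = percent_decrease(old + half, new - half)
    return lowest, highest


# Sizes in GB, as reported above.
before_compaction = percent_decrease(2.1, 1.9)
after_compaction = percent_decrease(1.4, 1.3)
before_low, before_high = decrease_bounds(2.1, 1.9)

print(f"before compaction: {before_compaction:.1f}% "
      f"(between {before_low:.1f}% and {before_high:.1f}%)")
print(f"after compaction:  {after_compaction:.1f}%")
```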

With the caveat that the benchmark machine was not completely idle, it
also leads to a speedup of roughly 10% in wall-clock time (e.g. initial
"notmuch new": 565s -> 494s).

Existing indexing:

T00-new.sh: Testing notmuch new                         [0.4 large]
			Wall(s)	Usr(s)	Sys(s)	Res(K)	In/Out(512B)
  Initial notmuch new   565.17	534.82	28.22	474632	0/13854576
  notmuch new #2        0.03	0.00	0.00	9512	0/160
  notmuch new #3        0.00	0.00	0.00	9368	0/8
  notmuch new #4        0.00	0.00	0.00	9412	0/8
  notmuch new #5        0.00	0.00	0.00	9384	0/8
  notmuch new #6        0.00	0.00	0.00	9388	0/8

T01-dump-restore.sh: Testing dump and restore           [0.4 large]
			Wall(s)	Usr(s)	Sys(s)	Res(K)	In/Out(512B)
  load nmbug tags       16.25	2.65	3.05	12668	104/40104
  dump *                3.90	3.79	0.10	26048	0/27928
  restore *             4.51	4.10	0.41	9564	0/0

T02-tag.sh: Testing tagging                             [0.4 large]
			Wall(s)	Usr(s)	Sys(s)	Res(K)	In/Out(512B)
  tag * +new_tag        374.69	197.56	169.55	118644	0/1818656
  tag * +existing_tag   0.00	0.00	0.00	9232	0/0
  tag * -existing_tag   318.47	151.46	164.56	36260	0/1819584
  tag * -missing_tag    0.00	0.00	0.00	9336	0/0

T03-reindex.sh: Testing tagging                         [0.4 large]
			Wall(s)	Usr(s)	Sys(s)	Res(K)	In/Out(512B)
  reindex *             688.27	488.02	197.59	11142680	0/4908120
  reindex *             648.04	456.06	191.78	11139124	0/2696120
  reindex *             650.70	459.08	191.48	11139088	0/2696680

T04-thread-subquery.sh: Testing thread subqueries       [0.4 large]
			Wall(s)	Usr(s)	Sys(s)	Res(K)	In/Out(512B)
  search thread:{} ...  2.45	2.29	0.15	94696	0/144
  search thread:{} ...  2.43	2.23	0.20	94228	0/144
  search thread:{} ...  2.46	2.26	0.20	94224	0/144

With new indexing:

T00-new.sh: Testing notmuch new                         [0.4 large]
			Wall(s)	Usr(s)	Sys(s)	Res(K)	In/Out(512B)
  Initial notmuch new   494.31	466.96	24.28	447428	0/12093344
  notmuch new #2        0.03	0.00	0.00	9356	0/144
  notmuch new #3        0.01	0.01	0.00	9420	0/8
  notmuch new #4        0.00	0.00	0.00	9388	0/8
  notmuch new #5        0.00	0.00	0.00	9416	0/8
  notmuch new #6        0.01	0.00	0.01	9424	0/8

T01-dump-restore.sh: Testing dump and restore           [0.4 large]
			Wall(s)	Usr(s)	Sys(s)	Res(K)	In/Out(512B)
  load nmbug tags       14.21	2.41	2.71	12664	0/38952
  dump *                3.70	3.57	0.12	26092	0/27928
  restore *             4.19	3.78	0.41	9412	0/0

T02-tag.sh: Testing tagging                             [0.4 large]
			Wall(s)	Usr(s)	Sys(s)	Res(K)	In/Out(512B)
  tag * +new_tag        353.31	183.89	161.49	111244	0/1693872
  tag * +existing_tag   0.00	0.00	0.00	9316	0/0
  tag * -existing_tag   284.07	137.15	144.33	36712	0/1659200
  tag * -missing_tag    0.00	0.00	0.00	9240	0/0

T03-reindex.sh: Testing tagging                         [0.4 large]
			Wall(s)	Usr(s)	Sys(s)	Res(K)	In/Out(512B)
  reindex *             640.19	431.23	196.99	10214564	1510/4504024
  reindex *             611.46	412.37	193.07	10211852	1056/2557688
  reindex *             612.95	415.40	194.97	10211848	0/2555032

T04-thread-subquery.sh: Testing thread subqueries       [0.4 large]
			Wall(s)	Usr(s)	Sys(s)	Res(K)	In/Out(512B)
  search thread:{} ...  2.34	2.12	0.21	96452	0/144
  search thread:{} ...  2.35	2.17	0.18	96208	0/144
  search thread:{} ...  2.33	2.08	0.25	94740	0/144
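
The wall-clock reductions can be computed per test from the two sets of
tables above. This is a minimal sketch: the numbers are copied from the
Wall(s) columns, the three reindex runs are averaged (my choice, not
part of the test suite), and the near-zero rows are left out.

```python
def mean(values):
    """Arithmetic mean of a sequence of numbers."""
    return sum(values) / len(values)


def reduction(old, new):
    """Relative wall-time reduction from old to new, in percent."""
    return (old - new) / old * 100.0


# Wall-clock seconds: (existing indexing, new indexing).
WALL = {
    "Initial notmuch new": (565.17, 494.31),
    "load nmbug tags": (16.25, 14.21),
    "tag * +new_tag": (374.69, 353.31),
    "tag * -existing_tag": (318.47, 284.07),
    "reindex * (mean of 3)": (
        mean([688.27, 648.04, 650.70]),
        mean([640.19, 611.46, 612.95]),
    ),
}

results = {name: reduction(old, new) for name, (old, new) in WALL.items()}

for name, pct in results.items():
    print(f"{name:24s} {pct:5.1f}% less wall time")
```

On these figures the reduction ranges from roughly 6% to 13% depending
on the test, which is consistent with the "roughly 10%" quoted earlier,
subject to the same caveat about the machine not being idle.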

