[PATCH 2/5] Add quotes around id:"message-id" queries.

Olly Betts olly at survex.com
Mon Jul 5 00:33:50 PDT 2010


On Fri, Jul 02, 2010 at 05:04:46PM +0400, Dmitry Kurochkin wrote:
> On Fri, 2 Jul 2010 04:41:43 +0000 (UTC), Olly Betts <olly at survex.com> wrote:
> > On 2010-07-01, Dmitry Kurochkin wrote:
> > > -  (concat "id:" (notmuch-show-get-prop :id props)))
> > > +  (concat "id:\"" (notmuch-show-get-prop :id props) "\""))
> > 
> > This is probably a good idea (the ".." example is arguably a Xapian bug, so
> > that should be fixed soon, but you find all sorts of junk in message-ids).
> 
> If I comment out add_valuerangeprocessor call in
> notmuch_database_open(), ids with .. are matched fine with no quotes.

Yes, the code which checks for ranges is disabled if there are no possible
ranges to find.

> So it seems that xapian uses the ValueRangeProcessor for all terms
> while it should be used for one value parsing only. Is this correct?

The issue is that if there's a ".." in there you have to ask the VRPs to
find out if it is a range they understand or not, so they have to be called
first in such cases (otherwise the same prefix couldn't be made to work for
ranges and single term filters).  There needs to be some sort of fallback
to considering boolean filters if there isn't a valid range though.

> Is there a xapian bug for this?

I couldn't find a ticket for it, but I was aware of the issue.

I've committed a fix to Xapian now (r14790 on trunk), which should be in
Xapian 1.2.3 when it gets released.

> I have found a xapian bug #128 "Allow queryparser to treat some prefixes
> as literal text". Seems to be just what we need here. Perhaps instead of
> quoting in emacs client, we can wait for the value range parsing fix
> (can be fixed in minor release?) and use #128 when it is available. IMHO
> should be good enough in most cases. What do you think?

The main problem at the moment is with "..", which is now fixed on trunk.
So any Xapian version with #128 fully addressed will handle ".." in
message-ids fine anyway.

With current trunk, message-ids containing whitespace or ')' will still
misbehave unless you quote them.  If the "FieldProcessor" idea in #128 were
implemented, you could arrange for whitespace and ')' to be included, but
then it would be impossible to end a message-id term - it would span to the
end of the query string, which I think would surprise most users.

The ability to quote terms discussed in #128 is already implemented (that
is what you've been using!) and I think that using this selectively is
probably the best way to deal with this.

If you only try to quote message-ids which either:

* contain whitespace, "..", or ')'
* start with '"'

Then the only cases which break with older Xapian will be those which
wouldn't work there anyway, plus message-ids which start with a '"' (which 
seem rare - I couldn't find any in my mail folders).
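
As a rough sketch of that selective-quoting rule (Python purely for
illustration; the name id_query is hypothetical and is not part of notmuch
or Xapian, and the sketch assumes message-ids do not need any embedded '"'
characters escaped):

```python
import re


def id_query(message_id: str) -> str:
    """Build an id: query term, quoting only when the rule above says so.

    Hypothetical helper for illustration only. Embedded '"' characters are
    passed through unchanged, which is an assumption of this sketch.
    """
    needs_quotes = (
        re.search(r"\s", message_id) is not None  # contains whitespace
        or ".." in message_id                     # could look like a range
        or ")" in message_id                      # would end a bracketed group
        or message_id.startswith('"')             # would be read as a quoted term
    )
    if needs_quotes:
        return 'id:"' + message_id + '"'
    return "id:" + message_id
```

Ordinary message-ids come out unquoted, so they produce the same query as
before on any Xapian version; only the awkward ones get wrapped in quotes.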

Cheers,
    Olly

