`notmuch-escape-boolean-term': Broken for non-ascii characters
Austin T. Clements
aclements at csail.mit.edu
Tue Aug 12 07:33:00 PDT 2014
Quoting Moritz Ulrich <moritz at tarn-vedra.de>:
> Hello,
>
> I recently adopted notmuch as my primary way to read mail, so thank you
> for this great tool!
>
> Unfortunately, I ran into a problem on the Emacs side of the project
> when used in a non-ASCII environment:
>
> Having a tag named 'uni-köln', the tag:-completion doesn't work.
>
> This is caused by `notmuch-escape-boolean-term' erroneously escaping the
> above string:
>
> (notmuch-escape-boolean-term "uni-köln") => "\"uni-köln\""
>
> This is caused by `string-match' with the following regexp erroneously
> matching my tag:
>
> (string-match "[^!#-'*-~]" "uni-köln") => 5
> (string-match "[^!#-'*-~]" "uni-koln") => nil
>
> I'm not exactly sure how to tackle this - the regexp was crafted to match
> (, ), " if I understand it correctly. A simple way would be just adding
> more characters as a sort-of whitelist. A nicer solution would be
> converting it from [^...] to [...] to explicitly mark characters that need
> to be escaped.

notmuch-escape-boolean-term used to use a blacklist, but we switched
to a whitelist because Xapian's own parser has changed over the years
in its handling of non-ASCII characters and invalidated our blacklist.
Ultimately it seemed much safer to go with a whitelist. Quoting
"uni-köln" isn't erroneous, it's just conservative.

Could you explain in more detail what's broken? I tried adding the
tag uni-köln to a message in Emacs, then hitting "s" to start a search
then "tag:<TAB>" and that tag (surrounded by quotes) was one of the
completion options. Upon completing to that tag, the search worked
fine.
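
For reference, the conservative quoting that produces that quoted
completion can be sketched roughly as follows. This is a minimal
sketch, not necessarily the exact code in notmuch: the name
my/escape-boolean-term is hypothetical, the regexp is the one quoted
above, and escaping an embedded double quote by doubling it is an
assumption about the query syntax.

```elisp
;; Hypothetical helper; a sketch of whitelist-based quoting, not
;; necessarily notmuch's actual implementation.
(defun my/escape-boolean-term (term)
  "Return TERM, double-quoted unless it has only whitelisted characters."
  (save-match-data
    (if (or (equal term "")
            ;; Whitelist from the thread: ASCII printing characters
            ;; other than the double quote and the two parentheses.
            ;; Any other character (including non-ASCII) forces quoting.
            (string-match "[^!#-'*-~]" term))
        ;; Assumption: an embedded double quote is escaped by doubling it.
        (concat "\"" (replace-regexp-in-string "\"" "\"\"" term t t) "\"")
      term)))

(my/escape-boolean-term "uni-koln")  ; => "uni-koln"
(my/escape-boolean-term "uni-köln")  ; => "\"uni-köln\""
```

The non-ASCII ö falls outside the whitelist, so the whole term is
quoted; the quoted form is still a legal query term.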

Are you objecting to the unnecessary (but legal) quotes in the
completion? We might be able to include Unicode word characters in
the quoting whitelist, though that seems like a spot fix (probably a
fairly broad one, so maybe that's fine) and might be tricky because of
Emacs' somewhat weird Unicode regexp support (using [[:alpha:]] might
Just Work, but we'd have to be careful of the active syntax table).
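
A rough sketch of what the [[:alpha:]] variant might look like. The
name my/term-needs-quoting-p is hypothetical, and whether every
character matched by [:alpha:] is actually safe to leave unquoted for
Xapian is an untested assumption. Binding the standard syntax table is
one way to avoid depending on whatever table is active in the current
buffer.

```elisp
;; Hypothetical variant of the whitelist test, for illustration only.
(defun my/term-needs-quoting-p (term)
  "Return non-nil if TERM has a character outside the extended whitelist."
  (save-match-data
    ;; Pin the syntax table so the result does not depend on the
    ;; current buffer's mode.
    (with-syntax-table (standard-syntax-table)
      (and (or (equal term "")
               ;; Same ASCII whitelist as before, plus letters.
               (string-match "[^!#-'*-~[:alpha:]]" term))
           t))))

(my/term-needs-quoting-p "uni-köln")  ; => nil
(my/term-needs-quoting-p "a b")       ; => t
```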

Or tab completion could recognize that, say, tag:uni doesn't require
quoting, but still expand it to tag:"uni-köln".
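
One way that completion idea could be sketched: match the typed prefix
against the raw tag names, then quote only the expansion. The function
my/tag-completion-candidates and its arguments are hypothetical, and
this is not how notmuch's completion is currently wired up.

```elisp
;; Hypothetical helper: TAGS is a list of raw tag names and ESCAPE-FN
;; stands in for a quoting function such as notmuch-escape-boolean-term.
(defun my/tag-completion-candidates (prefix tags escape-fn)
  "Return escaped tag: candidates for TAGS whose raw name starts with PREFIX."
  (delq nil
        (mapcar (lambda (tag)
                  ;; Match on the raw tag, so an unquoted prefix such
                  ;; as "uni" still finds tags that will need quotes.
                  (when (string-prefix-p prefix tag)
                    (concat "tag:" (funcall escape-fn tag))))
                tags)))

;; Example with a stand-in quoting function that always quotes:
(my/tag-completion-candidates
 "uni" '("inbox" "uni-köln")
 (lambda (tag) (concat "\"" tag "\"")))
;; => ("tag:\"uni-köln\"")
```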