[RFC] http://notmuchmail.org/searching/ [was: Re: Improving notmuch query documentation]

Austin Clements amdragon at MIT.EDU
Thu Mar 15 19:11:24 PDT 2012


Quoth Andrei POPESCU on Mar 16 at  2:30 am:
> On Thu, 15 Mar 12, 17:11:08, Austin Clements wrote:
> > 
> > I think having two divergent documents covering the same thing is less
> > than ideal, but perhaps they could be merged in the near future.
> 
> I want to have this page more or less complete and descriptive. Once 
> this is done I should be able to rewrite it more like a reference.
> 
> Regarding 'notmuch help search-terms':
> 
> $ notmuch help search-terms | wc -l
> 88
> 
> IMHO that text is better suited for a manpage, the help should be just a 
> (very short) reference to refresh one's memory. What do you think?

I'm not quite sure what you mean.  That text is the man page.  Though
it sounds like a great idea to have a quick syntax reference at the
top of the manpage so it's the first thing people see when they run
'notmuch help search-terms' (and they can still scroll down to get the
details if they want).

> > A few comments:
> > 
> > The section on "Languages other than English" isn't quite correct.
> > Xapian has no idea what language is being used, so it will still stem
> > terms in other languages, but using English stemming rules.
> 
> Then I think it's safe to assume the results are very much dependent on 
> the language, so if the language has some similarities to English Xapian 
> might do some stemming.

My point is that Xapian *will* do stemming, but using English stemming
rules, whether or not the language is English.  Hence it's inaccurate
to say that text in other languages will be unstemmed.  (Also, to be
fair to Xapian, it has stemmers for a whole bunch of languages; it's
notmuch that always configures it for English.)

> > Notmuch doesn't use synonyms.
> 
> Thanks.
> 
> > It might be worth pointing out that "+term1" and "term1" are
> > equivalent.
> 
> Yes.
> 
> > "notmuch search -term2" doesn't actually work.  I've never looked
> > into why, but I've found that Xapian ignores '-' at the beginning of a
> > query or a parenthesized expression.
> 
> Not sure what you mean here. Does Xapian just ignore the '-' and 
> search as if it wasn't specified? I'm usually testing stuff with 

I looked at this again and realized I was slightly wrong, so I had a
discussion with Olly and dug further into the code.  At a high level, a
query is a bunch of "probs" combined with boolean operators.  The
actual rule is that a prob consisting solely of a single '-' term is a
syntax error, and if a query has a syntax error, Xapian will re-parse
the entire query without any flags (which means no boolean operators,
love/hate, phrases, or wildcards).

One upshot of this rule is that a standalone negation like
'-tag:inbox' will actually search for messages *with* tag:inbox (just
like searching for 'tag:inbox'):

$ notmuch count -- -tag:inbox
650
$ notmuch count -- tag:inbox
650
$ notmuch count -- -tag:inbox x
9905

> 'notmuch count', but I get:
> 
> $ notmuch count -Debian
> Unrecognized option: -Debian

You need to tell count that it's not an option.  The standard getopt
syntax for this works:
  notmuch count -- -Debian
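
To illustrate the '--' convention without needing a notmuch database,
here is a minimal sketch using the shell's own getopts.  The function
fake_count and its single -v option are hypothetical stand-ins, not
notmuch's actual option parser:

```shell
# fake_count: hypothetical stand-in for a command that parses options.
# It accepts only -v and treats everything after the options as the query.
fake_count() {
    OPTIND=1
    while getopts ":v" opt; do
        case "$opt" in
            v) ;;
            \?) echo "Unrecognized option: -$OPTARG" >&2; return 1 ;;
        esac
    done
    shift $((OPTIND - 1))
    printf 'query: %s\n' "$*"
}

# Without '--', the leading '-' is parsed as an option and rejected.
fake_count -Debian || echo "rejected"

# With '--', option parsing stops and -Debian is passed through as a term.
fake_count -- -Debian
```

The second call prints "query: -Debian", which is the same effect '--'
has in the notmuch command above.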

> With 'search' I get results, but right now I can't think of a query to 
> test.
> 
> > "notmuch search term1 -term2" will work.
> 
> Does 'notmuch search -term1 term2' work?

'notmuch search -- -term1 term2' works.

> > In the brackets section, you'll need shell escaping for those queries
> > to work.  It might be worth pointing out the need for shell escaping
> > at the beginning.
> 
> Right, anything other than brackets and '*'?

Also double quotes, since the shell strips them before notmuch ever
sees the query.  E.g.,
  notmuch search subject:"This may look like a phrase, but don't be fooled"
will search for messages with "this" in the subject and the words
"may", "look", "like", etc. anywhere (and in any order).

I think that's it for Xapian metacharacters, but of course things in
the query terms themselves could need shell escaping.  Rather than try
to think about this, I generally put my whole query in single quotes
unless it's something obviously trivial that won't contain any shell
metacharacters.
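
A quick way to see what the shell actually hands to a command is to
print the arguments back.  This is only a sketch; show_args is a
hypothetical helper, not part of notmuch:

```shell
# show_args: hypothetical helper that prints each argument it receives
# on its own line, wrapped in brackets.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

# Bare double quotes are consumed by the shell: the command receives
# one argument with no quote characters left in it.
show_args subject:"a phrase"
# prints: [subject:a phrase]

# Single quotes around the whole query preserve the double quotes.
show_args 'subject:"a phrase"'
# prints: [subject:"a phrase"]
```

Only the second form still contains the double quotes by the time the
command sees it.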

> > XOR, NEAR, and ADJ were intentionally undocumented in
> > notmuch-search-terms because they may go away some day and we don't
> > want people thinking they can depend on them.
> 
> In that case I think it's better to state so.
> 
> I'll integrate all your comments (if somebody else doesn't beat me to 
> it).
> 
> Kind regards,
> Andrei

