correct way to search for only PDF attachments

Carl Worth cworth at cworth.org
Mon Sep 28 19:00:13 PDT 2015


On Mon, Sep 28 2015, Xu Wang wrote:
> I would like to look for all emails from a colleague jongho. I tried:
>
> from:jongho attachment:pdf
>
> which seems to do as I wanted.

Good. That should work.

> To understand more, what does the following search for?
>
> from:jongho attachment:.*pdf

Uhm, probably only strange things. There are some mechanisms for getting
notmuch to emit debugging information about what the final search terms
end up being (but I don't recall whether they still require
recompilation).

I'm not testing now, but I wouldn't be surprised if that ended up doing
something like searching for a phrase like "attachment pdf" anywhere
within a message. (The Xapian parser can be somewhat unpredictable when
you give it unexpected input.)

> Also, how does the first one above know that I want only PDF
> attachments and not an attachment called "pdformula.txt" ?

It doesn't know that you want only PDF attachments. The key part is that
indexing breaks text up into individual terms (usually at punctuation
boundaries). So a search specification like "attachment:pdf" looks for
messages that were indexed with the term "pdf" under the attachment
prefix. That won't match a filename like pdformula.txt (which would be
indexed as two terms, "pdformula" and "txt"), but it would match
pdf.ormula.txt (which would be indexed as three terms, "pdf", "ormula"
and "txt").
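
To make the term splitting concrete, here is a rough Python sketch. It is
only an approximation: the regular expression and the names split_terms
and matches_attachment are made up for illustration, and the real Xapian
term generator has more rules than "break at anything that is not a
letter or digit".

```python
import re

def split_terms(filename):
    # Assumption: lowercase the name and break it at every character
    # that is not a letter or digit. Xapian's actual rules are richer.
    return re.findall(r"[a-z0-9]+", filename.lower())

def matches_attachment(filename, term):
    # A search term matches only if it equals one whole indexed term;
    # a mere substring of a longer term does not count.
    return term.lower() in split_terms(filename)

for name in ["report.pdf", "pdformula.txt", "pdf.ormula.txt"]:
    print(name, split_terms(name), matches_attachment(name, "pdf"))
```

Under this simplified model, report.pdf and pdf.ormula.txt match the
term "pdf", while pdformula.txt does not.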

See the Xapian documentation if you want more details.

-Carl