correct way to search for only PDF attachments

Xu Wang xuwang762 at gmail.com
Mon Sep 28 21:51:01 PDT 2015


On Mon, Sep 28, 2015 at 10:00 PM, Carl Worth <cworth at cworth.org> wrote:
> On Mon, Sep 28 2015, Xu Wang wrote:
>> I would like to look for all emails from a colleague jongho. I tried:
>>
>> from:jongho attachment:pdf
>>
>> which seems to do as I wanted.
>
> Good. That should work.
>
>> To understand more, what does the following search for?
>>
>> from:jongho attachment:.*pdf
>
> Uhm, probably only strange things. There are some mechanisms for getting
> notmuch to emit some debugging information on what the final search
> terms end up being, (but I don't recall if they still require
> recompilation or not).
>
> I'm not testing now, but I wouldn't be surprised if that ended up doing
> something like searching for a phrase like "attachment pdf" anywhere
> within a message. (The Xapian parser can be somewhat unpredictable when
> you give it unexpected input.)
>
>> Also, how does the first one above know that I want only PDF
>> attachments and not an attachment called "pdformula.txt" ?
>
> It doesn't know that you want only PDF attachments. The key part is that
> the indexing is performed by breaking text up into individual terms, (at
> punctuation boundaries usually). So a search specification like
> "attachment:pdf" is searching for things that were indexed with the
> "pdf" term within the attachment prefix. So that won't match a filename
> like pdformula.txt, (which would be indexed as two terms, "pdformula"
> and "txt"), but it would match pdf.ormula.txt, (which would be indexed
> as three terms, "pdf", "ormula" and "txt").
>
> The Xapian documentation can be examined if you want more details.
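
To check my understanding of the term splitting described above, here is a
rough sketch in Python. It is only an approximation and not Xapian's actual
tokenizer: it assumes terms are lowercased and split at every character that
is not an ASCII letter or digit. The function names split_terms and
matches_attachment are made up for this illustration and are not part of
notmuch or Xapian.

```python
import re


def split_terms(filename):
    """Approximate term splitting (an assumption, not Xapian's real rules):
    lowercase the name, then break it at every run of characters that is
    not an ASCII letter or digit."""
    return [term for term in re.split(r"[^0-9a-z]+", filename.lower()) if term]


def matches_attachment(filename, term):
    """Return True if `term` equals one of the whole terms of `filename`.
    A substring inside a longer term does not count as a match."""
    return term.lower() in split_terms(filename)


print(split_terms("pdformula.txt"))                # ['pdformula', 'txt']
print(split_terms("pdf.ormula.txt"))               # ['pdf', 'ormula', 'txt']
print(matches_attachment("report.pdf", "pdf"))     # True
print(matches_attachment("pdformula.txt", "pdf"))  # False
```

Under these assumptions, "attachment:pdf" behaves like a whole-term lookup,
so it matches report.pdf and pdf.ormula.txt but not pdformula.txt.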

This is highly useful. Thanks for such an explanation! Thank you, Carl.

Kind regards,

Xu

