[PATCH] Index Content-Type of attachments with a contenttype prefix

Todd todd at electricoding.com
Sat Jan 10 06:38:09 PST 2015


>>>>> "Jani" == Jani Nikula <jani at nikula.org> writes:

    Jani> On Sat, 10 Jan 2015, Todd <todd at electricoding.com> wrote:
    >> I wanted to tag messages with calendar invitations, but couldn't as
    >> the information wasn't indexed.
    >> 
    >> This patch allows for queries like:
    >> 
    >> Find calendar invites
    >> - contenttype:text/calendar or contenttype:application/ics
    >> 
    >> Find any image attachments
    >> - contenttype:image
    >> 
    >> Find all patches
    >> - contenttype:text/x-patch
    >> 
    >> 
    >> - Todd
    >> 
    >> ---
    >> NEWS                               |  6 ++++++
    >> completion/notmuch-completion.bash |  2 +-
    >> doc/man7/notmuch-search-terms.rst  |  6 ++++++
    >> emacs/notmuch.el                   |  2 +-
    >> lib/database.cc                    |  1 +
    >> lib/index.cc                       |  5 +++++
    >> test/T190-multipart.sh             | 32 ++++++++++++++++++++++++++++++++

    Jani> IMO these could be split into several patches.

    No problem, I'll split them up the next time I post.

    >> 7 files changed, 52 insertions(+), 2 deletions(-)
    >> 
    >> diff --git a/NEWS b/NEWS
    >> index 44e8d05..5f4622c 100644
    >> --- a/NEWS
    >> +++ b/NEWS
    >> @@ -15,6 +15,12 @@ keyboard shortcuts to saved searches.
    >> Command-Line Interface
    >> ----------------------
    >> 
    >> +There is a new `contenttype:` search prefix
    >> +
    >> +  The new `contenttype:` search prefix allows searching for the
    >> +  content-type of attachments, which is now indexed by `notmuch
    >> +  insert`. See the `notmuch-search-terms` manual page for details.
    >> +

    Jani> Admittedly I did not have the time to dig into details, but I think
    Jani> "attachment" is misleading, as it's really all mime parts, right?

    Jani> Will this also index the Content-Type: header of the message itself,
    Jani> regardless of whether it has mime structure or not? Maybe it
    Jani> should?

    Yes, it indexes all MIME parts. It does not index the Content-Type
    of the message itself.  That probably wouldn't be difficult to add
    if it's a desired feature, but if there are plans for indexing other
    message headers it may fit better there.
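
    To make the distinction concrete, here is a rough sketch in Python
    (not the C++ from the patch) of which content-types I mean.  The
    sample message and the function name part_content_types are made up
    for illustration, and skipping the top-level header is only my
    reading of "the message itself", not a statement about lib/index.cc.

```python
from email import message_from_string

# Hypothetical sample message, not taken from the notmuch test suite.
SAMPLE = """\
From: sender@example.com
To: recipient@example.com
Subject: Meeting
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="XX"

--XX
Content-Type: text/plain

See the attached invite.
--XX
Content-Type: text/calendar; name="invite.ics"
Content-Disposition: attachment; filename="invite.ics"

BEGIN:VCALENDAR
END:VCALENDAR
--XX--
"""


def part_content_types(raw):
    """Return the content-type of every MIME part below the top level."""
    msg = message_from_string(raw)
    types = []
    for part in msg.walk():
        if part is msg:
            # Assumption: the message's own Content-Type header is skipped.
            continue
        types.append(part.get_content_type())
    return types


content_types = part_content_types(SAMPLE)
print(content_types)
```

    With this sample the collected types are text/plain and
    text/calendar, while the multipart/mixed from the message's own
    header is left out.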

    I also wasn't too happy with a "contenttype" keyword and debated
    just indexing the information under "attachment" along with the
    filename.

    >> Stopped `notmuch dump` failing if someone writes to the database
    >> 
    >> The dump command now takes the write lock when running. This
    >> diff --git a/completion/notmuch-completion.bash b/completion/notmuch-completion.bash
    >> index d58dc8b..05b5969 100644
    >> --- a/completion/notmuch-completion.bash
    >> +++ b/completion/notmuch-completion.bash
    >> @@ -61,7 +61,7 @@ _notmuch_search_terms()
    >> sed "s|^$path/||" | grep -v "\(^\|/\)\(cur\|new\|tmp\)$" ) )
    >> ;;
    >> *)
    >> -	    local search_terms="from: to: subject: attachment: tag: id: thread: folder: path: date:"
    >> +	    local search_terms="from: to: subject: attachment: contenttype: tag: id: thread: folder: path: date:"
    >> compopt -o nospace
    >> COMPREPLY=( $(compgen -W "${search_terms}" -- ${cur}) )
    >> ;;
    >> diff --git a/doc/man7/notmuch-search-terms.rst b/doc/man7/notmuch-search-terms.rst
    >> index 1acdaa0..d126ce6 100644
    >> --- a/doc/man7/notmuch-search-terms.rst
    >> +++ b/doc/man7/notmuch-search-terms.rst
    >> @@ -40,6 +40,8 @@ indicate user-supplied values):
    >> 
    >> -  attachment:<word>
    >> 
    >> +-  contenttype:<word>
    >> +
    >> -  tag:<tag> (or is:<tag>)
    >> 
    >> -  id:<message-id>
    >> @@ -66,6 +68,10 @@ by including quotation marks around the phrase, immediately following
    >> The **attachment:** prefix can be used to search for specific filenames
    >> (or extensions) of attachments to email messages.
    >> 
    >> +The **contenttype:** prefix can be used to search for specific
    >> +content-types of attachments to email messages (as specified by the
    >> +sender).
    >> +
    >> For **tag:** and **is:** valid tag values include **inbox** and
    >> **unread** by default for new messages added by **notmuch new** as well
    >> as any other tag values added manually with **notmuch tag**.
    >> diff --git a/emacs/notmuch.el b/emacs/notmuch.el
    >> index 218486a..702700c 100644
    >> --- a/emacs/notmuch.el
    >> +++ b/emacs/notmuch.el
    >> @@ -858,7 +858,7 @@ PROMPT is the string to prompt with."
    >> (lexical-let
    >> ((completions
    >> (append (list "folder:" "path:" "thread:" "id:" "date:" "from:" "to:"
    >> -		      "subject:" "attachment:")
    >> +		      "subject:" "attachment:" "contenttype:")
    >> (mapcar (lambda (tag)
    >> (concat "tag:" (notmuch-escape-boolean-term tag)))
    >> (process-lines notmuch-command "search" "--output=tags" "*")))))
    >> diff --git a/lib/database.cc b/lib/database.cc
    >> index 3601f9d..a7a64c9 100644
    >> --- a/lib/database.cc
    >> +++ b/lib/database.cc
    >> @@ -254,6 +254,7 @@ static prefix_t PROBABILISTIC_PREFIX[]= {
    >> { "from",			"XFROM" },
    >> { "to",			"XTO" },
    >> { "attachment",		"XATTACHMENT" },
    >> +    { "contenttype",		"XCONTENTTYPE"},
    >> { "subject",		"XSUBJECT"},

    Jani> Is the use of probabilistic prefix intentional? I think it's probably
    Jani> the right thing to do, but just checking.

    I'm not familiar with Xapian and just followed the precedent set by
    the attachment prefix.
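
    My rough mental model of why the probabilistic prefix matters here,
    offered as an assumption rather than something checked against the
    Xapian source: the value is split into words before indexing, so
    "image/png" yields the words "image" and "png", and a query such as
    contenttype:image can match on one of them.  The toy below only
    imitates that word splitting in Python; gen_terms and matches are
    hypothetical helpers, not notmuch or Xapian functions.

```python
import re


def gen_terms(prefix, text):
    """Toy stand-in for probabilistic term generation: one term per word."""
    return {(prefix, word) for word in re.findall(r"[a-z0-9]+", text.lower())}


def matches(query_value, indexed_terms, prefix="contenttype"):
    """Match when every word of the query value was indexed.

    Simplification: a real phrase query also checks word positions,
    which this toy ignores.
    """
    return gen_terms(prefix, query_value) <= indexed_terms


# Terms that would be generated for a part declared as image/png.
indexed = gen_terms("contenttype", "image/png")

print(matches("image", indexed))          # True
print(matches("image/png", indexed))      # True
print(matches("text/calendar", indexed))  # False
```

    If the prefix were boolean, I would expect only an exact match on
    the whole value to work, so contenttype:image would no longer find
    image/png parts.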

    Jani> BR,
    Jani> Jani.

    - Todd

