[PATCH] Index Content-Type of attachments with a contenttype prefix

Jani Nikula jani at nikula.org
Sat Jan 10 04:13:09 PST 2015


On Sat, 10 Jan 2015, Todd <todd at electricoding.com> wrote:
> I wanted to tag messages with calendar invitations, but couldn't as
> the information wasn't indexed.
>
> This patch allows for queries like:
>
> Find calendar invites
> - contenttype:text/calendar or contenttype:application/ics
>
> Find any image attachments
> - contenttype:image
>
> Find all patches
> - contenttype:text/x-patch
>
>
> - Todd
>
> ---
>  NEWS                               |  6 ++++++
>  completion/notmuch-completion.bash |  2 +-
>  doc/man7/notmuch-search-terms.rst  |  6 ++++++
>  emacs/notmuch.el                   |  2 +-
>  lib/database.cc                    |  1 +
>  lib/index.cc                       |  5 +++++
>  test/T190-multipart.sh             | 32 ++++++++++++++++++++++++++++++++

IMO these could be split into several patches.

>  7 files changed, 52 insertions(+), 2 deletions(-)
>
> diff --git a/NEWS b/NEWS
> index 44e8d05..5f4622c 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -15,6 +15,12 @@ keyboard shortcuts to saved searches.
>  Command-Line Interface
>  ----------------------
>
> +There is a new `contenttype:` search prefix
> +
> +  The new `contenttype:` search prefix allows searching for the
> +  content-type of attachments, which is now indexed by `notmuch
> +  insert`. See the `notmuch-search-terms` manual page for details.
> +

Admittedly I did not have time to dig into the details, but I think
"attachment" is misleading, as this really covers all MIME parts, right?

Will this also index the Content-Type: header of the message itself,
regardless of whether it has MIME structure or not? Maybe it should?
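
To make the question concrete, here is a rough sketch in plain shell (not
notmuch code) that lists every Content-Type header in the test message
from this patch, with the headers trimmed. The helper name
list_content_types is made up for illustration, and the sed scan is a
naive stand-in for a real MIME parser.

```shell
# Copy of the test message from the T190-multipart.sh hunk (headers trimmed).
msg=$(cat <<'EOF'
Subject: odd content types
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="==-=-=="

--==-=-==
Content-Type: application/unique_identifier

<p>This is an embedded message, with a multipart/alternative part.</p>

--==-=-==
Content-Type: text/some_other_identifier

This is an embedded message, with a multipart/alternative part.

--==-=-==--
EOF
)

# Hypothetical helper: print the type/subtype of every Content-Type
# header line, dropping parameters such as boundary=.
list_content_types () {
    printf '%s\n' "$1" |
	sed -n 's/^[Cc]ontent-[Tt]ype:[[:space:]]*\([^;[:space:]]*\).*/\1/p'
}

list_content_types "$msg"
```

The first type it prints, multipart/alternative, belongs to the message
itself rather than to any part. That is the one I am asking about; the
sketch only shows that it exists, not whether the patch indexes it.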

>  Stopped `notmuch dump` failing if someone writes to the database
>
>    The dump command now takes the write lock when running. This
> diff --git a/completion/notmuch-completion.bash b/completion/notmuch-completion.bash
> index d58dc8b..05b5969 100644
> --- a/completion/notmuch-completion.bash
> +++ b/completion/notmuch-completion.bash
> @@ -61,7 +61,7 @@ _notmuch_search_terms()
>  		sed "s|^$path/||" | grep -v "\(^\|/\)\(cur\|new\|tmp\)$" ) )
>  	    ;;
>  	*)
> -	    local search_terms="from: to: subject: attachment: tag: id: thread: folder: path: date:"
> +	    local search_terms="from: to: subject: attachment: contenttype: tag: id: thread: folder: path: date:"
>  	    compopt -o nospace
>  	    COMPREPLY=( $(compgen -W "${search_terms}" -- ${cur}) )
>  	    ;;
> diff --git a/doc/man7/notmuch-search-terms.rst b/doc/man7/notmuch-search-terms.rst
> index 1acdaa0..d126ce6 100644
> --- a/doc/man7/notmuch-search-terms.rst
> +++ b/doc/man7/notmuch-search-terms.rst
> @@ -40,6 +40,8 @@ indicate user-supplied values):
>
>  -  attachment:<word>
>
> +-  contenttype:<word>
> +
>  -  tag:<tag> (or is:<tag>)
>
>  -  id:<message-id>
> @@ -66,6 +68,10 @@ by including quotation marks around the phrase, immediately following
>  The **attachment:** prefix can be used to search for specific filenames
>  (or extensions) of attachments to email messages.
>
> +The **contenttype:** prefix can be used to search for specific
> +content-types of attachments to email messages (as specified by the
> +sender).
> +
>  For **tag:** and **is:** valid tag values include **inbox** and
>  **unread** by default for new messages added by **notmuch new** as well
>  as any other tag values added manually with **notmuch tag**.
> diff --git a/emacs/notmuch.el b/emacs/notmuch.el
> index 218486a..702700c 100644
> --- a/emacs/notmuch.el
> +++ b/emacs/notmuch.el
> @@ -858,7 +858,7 @@ PROMPT is the string to prompt with."
>    (lexical-let
>        ((completions
>  	(append (list "folder:" "path:" "thread:" "id:" "date:" "from:" "to:"
> -		      "subject:" "attachment:")
> +		      "subject:" "attachment:" "contenttype:")
>  		(mapcar (lambda (tag)
>  			  (concat "tag:" (notmuch-escape-boolean-term tag)))
>  			(process-lines notmuch-command "search" "--output=tags" "*")))))
> diff --git a/lib/database.cc b/lib/database.cc
> index 3601f9d..a7a64c9 100644
> --- a/lib/database.cc
> +++ b/lib/database.cc
> @@ -254,6 +254,7 @@ static prefix_t PROBABILISTIC_PREFIX[]= {
>      { "from",			"XFROM" },
>      { "to",			"XTO" },
>      { "attachment",		"XATTACHMENT" },
> +    { "contenttype",		"XCONTENTTYPE"},
>      { "subject",		"XSUBJECT"},

Is the use of a probabilistic prefix intentional? I think it's probably
the right thing to do, but just checking.
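
For context on why the choice matters: as I understand it, a
probabilistic prefix runs the value through the term generator, which
breaks it into separate word terms, whereas a boolean prefix would store
it as one exact term. Below is a rough model of that splitting in shell.
It assumes words break at anything that is not alphanumeric or "_"; that
rule is an assumption of this sketch, the real rules belong to Xapian.
gen_terms is a hypothetical helper, not a notmuch function.

```shell
# Hypothetical model of probabilistic term generation: lowercase the
# value, split it into words, and put the database prefix in front of
# each word. Usage: gen_terms PREFIX VALUE
gen_terms () {
    printf '%s' "$2" |
	tr '[:upper:]' '[:lower:]' |
	tr -cs '[:alnum:]_' '\n' |
	sed -e '/^$/d' -e "s/^/$1/"
}

gen_terms XCONTENTTYPE text/calendar
```

Under that model "text/calendar" yields one term for "text" and one for
"calendar", which would explain how the contenttype:image query from the
commit message can find image/png and image/jpeg parts alike.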

BR,
Jani.

>  };
>
> diff --git a/lib/index.cc b/lib/index.cc
> index 1a2e63d..c3f7c6b 100644
> --- a/lib/index.cc
> +++ b/lib/index.cc
> @@ -346,6 +346,11 @@ _index_mime_part (notmuch_message_t *message,
>  	return;
>      }
>
> +    GMimeContentType*  content_type = g_mime_object_get_content_type(part);
> +    if (content_type) {
> +	_notmuch_message_gen_terms (message, "contenttype", g_mime_content_type_to_string(content_type));
> +    }
> +
>      if (GMIME_IS_MESSAGE_PART (part)) {
>  	GMimeMessage *mime_message;
>
> diff --git a/test/T190-multipart.sh b/test/T190-multipart.sh
> index 85cbf67..e3270a7 100755
> --- a/test/T190-multipart.sh
> +++ b/test/T190-multipart.sh
> @@ -104,6 +104,30 @@ Content-Transfer-Encoding: base64
>  7w0K
>  --==-=-=--
>  EOF
> +
> +cat <<EOF > content_types
> +From: Todd <todd at electricoding.com>
> +To: todd at electricoding.com
> +Subject: odd content types
> +Date: Fri, 05 Jan 2001 15:42:57 +0000
> +User-Agent: Notmuch/0.5 (http://notmuchmail.org) Emacs/23.3.1 (i486-pc-linux-gnu)
> +Message-ID: <87liy5ap01.fsf at yoom.home.cworth.org>
> +MIME-Version: 1.0
> +Content-Type: multipart/alternative; boundary="==-=-=="
> +
> +--==-=-==
> +Content-Type: application/unique_identifier
> +
> +<p>This is an embedded message, with a multipart/alternative part.</p>
> +
> +--==-=-==
> +Content-Type: text/some_other_identifier
> +
> +This is an embedded message, with a multipart/alternative part.
> +
> +--==-=-==--
> +EOF
> +cat content_types >> ${MAIL_DIR}/odd_content_type
>  notmuch new > /dev/null
>
>  test_begin_subtest "--format=text --part=0, full message"
> @@ -727,4 +751,12 @@ test_begin_subtest "html parts included"
>  notmuch show --format=json --include-html id:htmlmessage > OUTPUT
>  test_expect_equal_json "$(cat OUTPUT)" "$(cat EXPECTED.withhtml)"
>
> +test_begin_subtest "indexes content-type"
> +output=$(notmuch search contenttype:application/unique_identifier | notmuch_search_sanitize)
> +test_expect_equal "$output" "thread:XXX   2001-01-05 [1/1] Todd; odd content types (inbox unread)"
> +
> +output=$(notmuch search contenttype:text/some_other_identifier | notmuch_search_sanitize)
> +test_expect_equal "$output" "thread:XXX   2001-01-05 [1/1] Todd; odd content types (inbox unread)"
> +
> +
>  test_done
> --
> 1.9.1
> _______________________________________________
> notmuch mailing list
> notmuch at notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch

