[PATCH] test: Add test for searching of uncommonly encoded messages

Michal Sojka sojkam1 at fel.cvut.cz
Thu Feb 23 16:33:15 PST 2012


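Emails encoded in a charset other than ASCII or UTF-8 are not indexed
properly by notmuch, so it is not possible to search for non-ASCII
words within those messages. Add a test, marked as known broken, that
demonstrates the problem with an ISO-8859-2 encoded message.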
---
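For reviewers, a minimal sketch of why the search term and the message
body differ at the byte level. It assumes iconv(1) is available, and
latin2_to_utf8 is a hypothetical helper used only for illustration; it
is not part of the test suite or of this patch.

```shell
# Hypothetical helper (not in test-lib.sh): convert ISO-8859-2 bytes
# read from stdin into UTF-8 on stdout. Assumes iconv(1) is installed.
latin2_to_utf8 () {
    iconv -f ISO-8859-2 -t UTF-8
}

# The same octal escapes the test body uses:
# \350 = c-caron, \362 = n-caron, \341 = a-acute, \355 = i-acute.
raw_word=$(printf 'tu\350\362\341\350\350\355')

# In ISO-8859-2 each accented letter is one byte (8 bytes in total).
# In UTF-8 each of them takes two bytes (14 bytes in total), so the raw
# bytes are not the same as the UTF-8 bytes of the search term.
utf8_word=$(printf '%s' "$raw_word" | latin2_to_utf8)
printf '%s\n' "$utf8_word"
```

The converted word is the term the new subtest passes to "notmuch
search" from a UTF-8 terminal.
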
 test/encoding    |    9 +++++++++
 test/test-lib.sh |    5 +++++
 2 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/test/encoding b/test/encoding
index 33259c1..3992b5c 100755
--- a/test/encoding
+++ b/test/encoding
@@ -21,4 +21,13 @@ irrelevant
 body}
 message}"
 
+test_begin_subtest "Search for ISO-8859-2 encoded message"
+test_subtest_known_broken
+add_message '[content-type]="text/plain; charset=iso-8859-2"' \
+            '[content-transfer-encoding]=8bit' \
+            '[subject]="ISO-8859-2 encoded message"' \
+            "[body]=$'Czech word tu\350\362\341\350\350\355 means penguin\'s.'" # ISO-8859-2 characters are generated by shell's escape sequences
+output=$(notmuch search tučňáččí 2>&1 | notmuch_show_sanitize)
+test_expect_equal "$output" "thread:0000000000000002   2001-01-05 [1/1] Notmuch Test Suite; ISO-8859-2 encoded message (inbox unread)"
+
 test_done
diff --git a/test/test-lib.sh b/test/test-lib.sh
index 063a2b2..2781506 100644
--- a/test/test-lib.sh
+++ b/test/test-lib.sh
@@ -356,6 +356,11 @@ ${additional_headers}"
 ${additional_headers}"
     fi
 
+    if [ ! -z "${template[content-transfer-encoding]}" ]; then
+	additional_headers="Content-Transfer-Encoding: ${template[content-transfer-encoding]}
+${additional_headers}"
+    fi
+
     # Note that in the way we're setting it above and using it below,
     # `additional_headers' will also serve as the header / body separator
     # (empty line in between).
-- 
1.7.9.1


