UTF-8 in mail headers (namely FROM) sent by bugzilla

Jani Nikula jani at nikula.org
Fri Aug 9 11:04:47 PDT 2013


On Fri, 09 Aug 2013, stedfast at comcast.net wrote:
> Hi guys, 
>
> (I'm the author of GMime, for those who don't know)
>
> I just came across the notmuch thread (with the referenced Subject)
> but unfortunately am not subscribed to the mailing list and so am
> unable to reply to the list (hopefully no one minds me emailing them
> directly!). I wanted to reach out and offer a possible solution to the
> problem being discussed.

Thanks for your mail; hopefully you don't mind me replying to the list!

> Passing the GMIME_ENABLE_RFC2047_WORKAROUNDS flag to g_mime_init()
> *should* solve the decoding problem mentioned in the thread. This flag
> should be safe to pass into g_mime_init() without any bad side effects
> and my unit tests do test that code-path.

Many thanks, this solves my issue with the subject lines.

This is the quick patch I tried:

diff --git a/notmuch.c b/notmuch.c
index 78d29a8..7300c21 100644
--- a/notmuch.c
+++ b/notmuch.c
@@ -264,7 +264,7 @@ main (int argc, char *argv[])
 
     local = talloc_new (NULL);
 
-    g_mime_init (0);
+    g_mime_init (GMIME_ENABLE_RFC2047_WORKAROUNDS);
 #if !GLIB_CHECK_VERSION(2, 35, 1)
     g_type_init ();
 #endif

We'll need to look into using this in the lib too.

BR,
Jani.


> I took a look at gmime-filter-headers.[c,h] as well and I suspect that
> it was written back when GMime brokenly did not guarantee UTF-8
> decoded strings from functions like g_mime_message_get_subject() and
> the like. This was fixed a while back. From a quick grep of the
> ChangeLog it looks like this was probably fixed in 2.5.9 or so (but
> possibly as late as 2.6.3 as there were some other charset rfc2047
> decoder fixes around then).
>
> I know for sure that the 2.4.x series didn't guarantee UTF-8-safe
> strings, but it's been the goal of 2.6.x to make that guarantee (minus
> any bugs that may exist, but if you find any cases of that, let me
> know!)
>
> (Note: raw header values from g_mime_object_get_header() are not
> guaranteed to be UTF-8, but if you call
> g_mime_utils_header_decode_text() or
> g_mime_utils_header_decode_phrase() on them, the results are
> guaranteed to be valid UTF-8)
>
> Hope that helps, 
>
> Jeff 
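
On the raw-header note above: since raw header values may not be valid
UTF-8, a check along these lines could be used to tell whether a given
byte string is well-formed before it is shown or stored. This is only a
sketch in plain C; is_valid_utf8() is a hypothetical helper, not part of
GMime or notmuch, and it does not replace decoding the header with
GMime's own functions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not a GMime or notmuch function): returns true if
 * the len bytes at s are well-formed UTF-8, rejecting overlong forms,
 * UTF-16 surrogates, and anything above U+10FFFF. */
static bool
is_valid_utf8 (const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = s[i];
        size_t need;        /* continuation bytes expected after the lead */
        unsigned long cp;   /* code point being assembled */
        unsigned long min;  /* smallest code point legal for this length */

        if (c < 0x80) {     /* plain ASCII */
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {
            need = 1; cp = c & 0x1F; min = 0x80;
        } else if ((c & 0xF0) == 0xE0) {
            need = 2; cp = c & 0x0F; min = 0x800;
        } else if ((c & 0xF8) == 0xF0) {
            need = 3; cp = c & 0x07; min = 0x10000;
        } else {
            return false;   /* stray continuation byte or invalid lead */
        }

        if (len - i - 1 < need)
            return false;   /* sequence is truncated */

        i++;
        for (size_t k = 0; k < need; k++, i++) {
            if ((s[i] & 0xC0) != 0x80)
                return false;   /* not a continuation byte */
            cp = (cp << 6) | (s[i] & 0x3F);
        }

        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;   /* overlong, out of range, or surrogate */
    }

    return true;
}
```

For example, the two-byte sequence 0xC3 0xA4 passes, while a lone 0xE4
byte (as a Latin-1 sender might put in a raw header) is rejected.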

