Handling mislabeled emails encoded with Windows-1252

Sebastian Poeplau sebastian.poeplau at eurecom.fr
Sat Jul 28 04:22:46 PDT 2018


Hi all,

Here's the updated patch. It first runs the message content through the
GMimeFilterWindows that Jeff mentioned, then passes the charset that
filter detects to GMimeFilterCharset when actually rendering the message.
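
For anyone reading without the attachment, the underlying idea can be
sketched in plain C. This is not the patch and not GMime's code; the
helper names has_c1_range_bytes and guess_real_charset are made up for
illustration. It assumes the detection boils down to spotting bytes in
0x80..0x9F, which ISO-8859-1 reserves for control characters but
Windows-1252 uses for printable ones such as curly quotes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of GMime or notmuch): returns 1 if the
 * buffer contains any byte in 0x80..0x9F. In ISO-8859-1 that range holds
 * C1 control characters; in Windows-1252 most of it is printable. */
int
has_c1_range_bytes (const unsigned char *data, size_t len)
{
    size_t i;

    assert (data != NULL || len == 0);

    for (i = 0; i < len; i++) {
        if (data[i] >= 0x80 && data[i] <= 0x9F)
            return 1;
    }

    return 0;
}

/* Hypothetical helper: choose the charset to decode with, given the
 * charset the message claims. Only the iso-8859-1 case from this thread
 * is handled, and the name is compared as an exact lowercase match to
 * keep the sketch short. */
const char *
guess_real_charset (const char *claimed, const unsigned char *data, size_t len)
{
    if (strcmp (claimed, "iso-8859-1") == 0 && has_c1_range_bytes (data, len))
        return "windows-1252";

    return claimed;
}
```

The name returned by such a check is what would then be used as the
source charset for the conversion to UTF-8, which is the role
GMimeFilterCharset plays in the patch.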

Jeff, is this how to use the filter correctly?

Cheers,
Sebastian


-------------- next part --------------
A non-text attachment was scrubbed...
Name: fix_windows_charsets.patch
Type: text/x-patch
Size: 2157 bytes
Desc: not available
URL: <http://notmuchmail.org/pipermail/notmuch/attachments/20180728/a09d6551/attachment.bin>
-------------- next part --------------



Sebastian Poeplau <sebastian.poeplau at eurecom.fr> writes:

> Hi Jeff,
>
>> GMime actually comes with a stream filter (GMimeFilterWindows) which can auto-detect this situation.
>>
>> In this particular case, you'd instantiate the GMimeFilterWindows like this:
>>
>> filter = g_mime_filter_windows_new ("iso-8859-1");
>>
>> "iso-8859-1" being the charset that the content claims to be in.
>>
>> Then you'd pipe the raw (decoded but not converted to utf-8) content through the filter and afterward call g_mime_filter_windows_real_charset (filter), which would return, in this user's case, "windows-1252".
>
> Nice, this is exactly what I was looking for! Somehow I missed it when
> checking GMime. I'll adapt my local fix and post the results here.
>
> Thanks,
> Sebastian

