Handling mislabeled emails encoded with Windows-1252
David Bremner
david at tethera.net
Mon Jul 23 18:49:23 PDT 2018
Sebastian Poeplau <sebastian.poeplau at eurecom.fr> writes:
> Hi,
>
> This email is to suggest a minor change in how notmuch handles text
> encoding when displaying emails. The motivation is the following: I keep
> receiving emails that are encoded with Windows-1252 but claim to be
> ISO 8859-1. The two character sets only differ in the range between 0x80
> and 0x9F where Windows-1252 contains special characters (e.g. “quotation
> marks”) while ISO 8859-1 only has non-printable ones. The mislabeling
> thus causes some special characters in such emails to be displayed with
> a replacement symbol for non-printable characters.
Hi Sebastian;
Everyone's mail situation is unique, but I haven't noticed this
problem. Do you have a mechanical (e.g. scripted) way of detecting such
mails? I suppose it could just look for characters in the range 0x80 to
0x9F in allegedly ISO_8859-1 messages. A census of the situation in my
own mail would help me think about this problem.
David
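
A minimal sketch of the kind of census described above, in Python using only the standard library. It is not part of notmuch; the helper names (suspicious_parts, census) and the list of charset aliases are assumptions made for illustration. It flags text parts that are labeled ISO 8859-1 but contain bytes in the range 0x80 to 0x9F after the transfer encoding has been undone.

```python
import email
from email import policy
from pathlib import Path

# Bytes 0x80-0x9F are C1 control characters in ISO 8859-1, but mostly
# printable characters (curly quotes, dashes, ...) in Windows-1252.
SUSPECT_BYTES = frozenset(range(0x80, 0xA0))

# Assumed set of labels that mean ISO 8859-1; extend as needed.
LATIN1_LABELS = {"iso-8859-1", "iso_8859-1", "latin1", "latin-1"}


def suspicious_parts(raw: bytes):
    """Yield (content_type, count) for each text part labeled ISO 8859-1
    whose decoded payload contains bytes in 0x80-0x9F."""
    msg = email.message_from_bytes(raw, policy=policy.default)
    for part in msg.walk():
        if part.get_content_maintype() != "text":
            continue
        charset = (part.get_content_charset() or "").lower()
        if charset not in LATIN1_LABELS:
            continue
        # decode=True undoes base64 / quoted-printable and returns bytes.
        payload = part.get_payload(decode=True) or b""
        count = sum(1 for byte in payload if byte in SUSPECT_BYTES)
        if count:
            yield part.get_content_type(), count


def census(paths):
    """Return (path, content_type, count) for every suspicious part found
    in the given message files."""
    hits = []
    for path in paths:
        path = Path(path)
        if not path.is_file():
            continue
        for content_type, count in suspicious_parts(path.read_bytes()):
            hits.append((str(path), content_type, count))
    return hits
```

To run it over a mail store, something like census(Path("/path/to/mail").rglob("*")) should work, where the path is a placeholder for wherever the messages live, one message per file. A hit is only a hint: the bytes could also be genuine C1 control characters, although those are rare in real mail.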