Handling mislabeled emails encoded with Windows-1252

David Bremner david at tethera.net
Mon Jul 23 18:49:23 PDT 2018


Sebastian Poeplau <sebastian.poeplau at eurecom.fr> writes:

> Hi,
>
> This email is to suggest a minor change in how notmuch handles text
> encoding when displaying emails. The motivation is the following: I keep
> receiving emails that are encoded with Windows-1252 but claim to be
> ISO 8859-1. The two character sets only differ in the range between 0x80
> and 0x9F where Windows-1252 contains special characters (e.g. “quotation
> marks”) while ISO 8859-1 only has non-printable ones. The mislabeling
> thus causes some special characters in such emails to be displayed with
> a replacement symbol for non-printable characters.

Hi Sebastian;

Everyone's mail situation is unique, but I haven't noticed this
problem. Do you have a mechanical (e.g. scripted) way of detecting such
mails? I suppose it could just look for bytes in the range 0x80 to
0x9F in allegedly ISO 8859-1 messages. A census of the situation in my
own mail would help me think about this problem.
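
A rough sketch of such a check, in Python with only the standard
library, might look like the following. The function name, the regex
name and the set of charset aliases are made up for illustration; this
is not anything notmuch does today, and it assumes the raw bytes of one
message are already in hand.

```python
import email
import re
from email import policy

# Bytes 0x80-0x9F are C1 control codes in ISO 8859-1, but mostly
# printable characters (curly quotes, dashes, ...) in Windows-1252.
C1_RANGE = re.compile(rb"[\x80-\x9f]")

# Hypothetical, non-exhaustive list of labels that mean ISO 8859-1.
LATIN1_NAMES = {"iso-8859-1", "iso_8859-1", "latin1", "latin-1"}


def suspicious_parts(raw_message: bytes):
    """Yield (content_type, hit_count) for each text part that is
    labeled ISO 8859-1 but contains bytes in the range 0x80-0x9F."""
    msg = email.message_from_bytes(raw_message, policy=policy.default)
    for part in msg.walk():
        if part.get_content_maintype() != "text":
            continue
        charset = (part.get_content_charset() or "").lower()
        if charset not in LATIN1_NAMES:
            continue
        # decode=True undoes base64 / quoted-printable and returns bytes.
        payload = part.get_payload(decode=True) or b""
        hits = C1_RANGE.findall(payload)
        if hits:
            yield part.get_content_type(), len(hits)
```

Running this over every file in a mail store and counting the messages
for which it yields anything would give the census. It only flags
candidates: a message could in principle contain genuine C1 control
codes, although that seems unlikely in practice.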

David



