Bug: problem decoding some non-ascii characters in subjects

Sat Feb 9 02:06:58 PST 2013

On Sat, 09 Feb 2013, Albin Stjerna <albin.stjerna at gmail.com> wrote:
> Jani Nikula wrote:
>
>> On Fri, 08 Feb 2013, Albin Stjerna <albin.stjerna at gmail.com> wrote:
>> > I've been noticing that notmuch has some problems decoding certain
>> > strangely-encoded non-ascii characters in certain emails. For example,
>> > today I got this: [BIBLIST] Digitaliseringensprojektens skadliga
>> > f=?ISO-8859-1?Q?=F6rk=E4rlek_f=F6r_?= PDF-formatet (should be
>> > rendered: »Digitaliseringsprojektens skadliga förkärlek för
>> > PDF-formatet«).
>> >
>> > Apparently, some metadata is passed on to help the MUA decode the
>> > string, but notmuch doesn't seem to handle it. Entire emails can of
>> > course be supplied as needed.
>
>> Please copy paste the Subject: header directly from the message file.
>
> The exact Subject: header (from the file, not notmuch) is:
> Subject: [BIBLIST] Digitaliseringensprojektens skadliga f=?ISO-8859-1?Q?=F6rk=E4rlek_f=F6r_?= PDF-formatet

Is that entirely on one line in the original message file? If not, where
exactly is it split?

Either way, at a glance, the encoding looks malformed. An encoded-word
("=?" charset "?" encoding "?" encoded-text "?=") must be separated from
the surrounding text by whitespace so that it forms an atom of its own
[RFC 2047, RFC 2822]. Here the leading 'f' is glued directly onto it.

If you manually move the leading 'f' to just after the "?Q?" bit, the
subject decodes as expected. It looks like the bug is in the sender's
user agent.
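
To illustrate the whitespace rule, here is a rough Python sketch of a
strict decoder. It is not notmuch's or GMime's actual code. The names
decode_word and decode_subject_strict are made up for this example, and
it skips details such as dropping the whitespace between two adjacent
encoded-words. It only decodes a token when the whole
whitespace-delimited token is an encoded-word:

```python
import binascii
import re

# encoded-word per RFC 2047: "=?" charset "?" encoding "?" encoded-text "?="
ENCODED_WORD = re.compile(r'=\?([^?\s]+)\?([bBqQ])\?([^?\s]*)\?=')

# Subject as received, and the hand-edited variant with the 'f' moved inside.
MALFORMED = ("[BIBLIST] Digitaliseringensprojektens skadliga "
             "f=?ISO-8859-1?Q?=F6rk=E4rlek_f=F6r_?= PDF-formatet")
FIXED = ("[BIBLIST] Digitaliseringensprojektens skadliga "
         "=?ISO-8859-1?Q?f=F6rk=E4rlek_f=F6r_?= PDF-formatet")


def decode_word(token):
    """Decode token if it is exactly one encoded-word, else return None."""
    m = ENCODED_WORD.fullmatch(token)
    if m is None:
        return None
    charset, encoding, text = m.groups()
    if encoding.lower() == 'q':
        # Q encoding: "_" stands for a space, "=XX" is one hex-encoded byte.
        raw = text.encode('ascii').replace(b'_', b' ')
        raw = re.sub(rb'=([0-9A-Fa-f]{2})',
                     lambda h: bytes([int(h.group(1), 16)]), raw)
    else:
        # B encoding is plain base64.
        raw = binascii.a2b_base64(text)
    return raw.decode(charset)


def decode_subject_strict(value):
    """Decode only those space-separated tokens that are encoded-words."""
    out = []
    for token in value.split(' '):
        decoded = decode_word(token)
        out.append(token if decoded is None else decoded)
    return ' '.join(out)


if __name__ == '__main__':
    print(decode_subject_strict(MALFORMED))  # token with the glued 'f' stays raw
    print(decode_subject_strict(FIXED))      # encoded-word is decoded
```

With the subject as received, the token starting with 'f' is not a valid
encoded-word, so a strict decoder leaves it untouched. More lenient
decoders may choose to decode it anyway.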

BR,
Jani.