utf-8 in author field

Carl Worth cworth at cworth.org
Fri Oct 29 17:19:07 PDT 2010


On Mon, 17 May 2010 09:56:27 +0200, Michal Sojka <sojkam1 at fel.cvut.cz> wrote:
> On Fri, 14 May 2010, Igor Shenderovich wrote:
> > What should one do to see the true list of authors?
> 
> I encounter the same when headers are not encoded properly according to
> RFC 2047. I commonly see the violation of section 5, paragraph (3),
> sentence "An 'encoded-word' MUST NOT appear within a 'quoted-string'".
> That is, when the encoded word is enclosed in double quotes. I guess the
> "problem" is not only notmuch-related; all users of the gmime library
> must be affected.

Thanks for that explanation, Michal.

Igor, does that explanation seem correct for the situation you have?

> I use the following patch for notmuch to sanitize headers from a popular
> mailing list server in the Czech Republic:

Obviously that patch is a little too specific to be considered for
upstream notmuch. But I'm curious to know if there's anything general
that we could do in notmuch?

My guess is that the best we could do is to come up with some heuristics
for recognizing a non-RFC-compliant header here and munging it. Those
heuristics could then misfire on messages that are RFC-compliant but
intentionally include a string of characters matching the heuristic
(which would presumably be rare, but not impossible, so perhaps we
would then need some configuration).
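
To make the idea concrete, here is a rough sketch of what such a
heuristic might look like. It is written in Python purely for
illustration and is not notmuch code; the function name, the regular
expression, and the example.org addresses are all assumptions made up
for this example. It only handles the case Michal describes, where a
quoted-string consists of nothing but encoded-words, and strips the
surrounding double quotes so that a decoder can see the encoded-word.

```python
import re

# One RFC 2047 encoded-word: =?charset?encoding?encoded-text?=
# (hypothetical, deliberately strict pattern for this sketch)
_EW = r'=\?[^?\s"]+\?[bBqQ]\?[^?\s"]*\?='

# A quoted-string whose whole content is one or more encoded-words.
_QUOTED_EW = re.compile(r'"\s*(' + _EW + r'(?:\s+' + _EW + r')*)\s*"')


def unquote_encoded_words(header: str) -> str:
    """Drop the double quotes around quoted encoded-words.

    Quoted strings that contain any ordinary text are left untouched,
    which keeps the heuristic from rewriting compliant headers in the
    common case.
    """
    return _QUOTED_EW.sub(lambda m: m.group(1), header)


if __name__ == "__main__":
    broken = '"=?utf-8?q?Michal_Sojka?=" <sojkam1@example.org>'
    print(unquote_encoded_words(broken))
```

Note that this sketch has exactly the weakness described above: a
compliant message that really means the literal text
"=?utf-8?q?...?=" inside quotes would be rewritten too, so something
like this would probably want to sit behind a configuration option.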

Anyway, if one of you could send an example of a misbehaving message, I
might like to look at it and perhaps add it to the test suite to see if
there's anything we can safely do about it.

-Carl

-- 
carl.d.worth at intel.com