web interface to notmuch

Tue Oct 31 11:47:34 PDT 2017

Hi Matthew

Sorry for chiming in out of the blue. I don't really know anything about
the code you are discussing, but I have some experience with Python.

Matthew Lear <matt at bubblegen.co.uk> writes:

> Traceback (most recent call last):
>   File "/usr/lib/python2.7/dist-packages/web/application.py", line 239, in
> process
>     return self.handle()
>   File "/usr/lib/python2.7/dist-packages/web/application.py", line 230, in
> handle
>     return self._delegate(fn, self.fvars, args)
>   File "/usr/lib/python2.7/dist-packages/web/application.py", line 420, in
> _delegate
>     return handle_class(cls)
>   File "/usr/lib/python2.7/dist-packages/web/application.py", line 396, in
> handle_class
>     return tocall(*args)
>   File "/b/git/notmuch-brians.git/contrib/notmuch-web/nmweb.py", line 153,
> in GET
>     sprefix=webprefix)
>   File "/usr/lib/python2.7/dist-packages/jinja2/environment.py", line 989,
> in render
>     return self.environment.handle_exception(exc_info, True)
>   File "/usr/lib/python2.7/dist-packages/jinja2/environment.py", line 754,
> in handle_exception
>     reraise(exc_type, exc_value, tb)
>   File "templates/show.html", line 1, in top-level template code
>     {% extends "base.html" %}
>   File "templates/base.html", line 32, in top-level template code
>     {% block content %}
>   File "templates/show.html", line 12, in block "content"
>     {% for part in format_message(m.get_filename(),mid): %}{{ part|safe
> }}{% endfor %}
>   File "/b/git/notmuch-brians.git/contrib/notmuch-web/nmweb.py", line 245,
> in format_message_walk
>     tags=safe_tags).encode(part.get_content_charset('ascii')))

My guess is that part.get_content_charset('ascii') returns the charset
declared in the message part's headers, with 'ascii' as the fallback if
none is declared. Here it apparently returns a Latin-1 charset, which is
then used to encode the output.
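
To illustrate that guess, here is a minimal sketch of how
get_content_charset behaves in the standard email package. The two tiny
messages are made up for the example; I have not seen the real message, so
the ISO-8859-1 header is only an assumption about what it might declare.

```python
from email import message_from_string

# Hypothetical message part that declares a Latin-1 charset.
raw_with_charset = (
    "Content-Type: text/plain; charset=ISO-8859-1\n"
    "\n"
    "body\n"
)
part = message_from_string(raw_with_charset)
# The declared charset is returned, coerced to lower case.
declared = part.get_content_charset('ascii')

# Hypothetical message part with no charset parameter at all.
raw_without_charset = "Content-Type: text/plain\n\nbody\n"
bare_part = message_from_string(raw_without_charset)
# Nothing is declared, so the fallback argument is returned.
fallback = bare_part.get_content_charset('ascii')

print(declared)
print(fallback)
```

So the function only reports what the header says; it does not look at the
characters that are actually in the text.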

> UnicodeEncodeError: 'latin-1' codec can't encode character u'\u201c'
> in position 1141: ordinal not in range(256)

Here is an interactive python session to reproduce:

>>> u = u'\u201c'
>>> u
u'\u201c'
>>> type(u)
<type 'unicode'> # (un-encoded)
>>> u.encode('utf-8')
'\xe2\x80\x9c'   # encoding to utf-8 works fine
>>> print u.encode('utf-8')
“
>>> print u.encode('latin-1')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'latin-1' codec can't encode character u'\u201c' in position 0: ordinal not in range(256)

The character cannot be encoded in latin-1. So one should first check
that the function returning the charset is doing its job properly; if it
is, the fault lies with the charset declared in the message itself.
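
If the message really does declare the wrong charset, the code doing the
encoding could be made more tolerant. Below is a rough sketch of two
possible approaches. The helper names encode_with_fallback and
encode_for_html are my own invention and not part of nmweb.py, and I have
not tested either against the actual web interface.

```python
def encode_with_fallback(text, charset, fallback='utf-8'):
    """Try the declared charset first; use the fallback if that fails.

    Returns the encoded bytes and the name of the charset actually used.
    """
    try:
        return text.encode(charset), charset
    except (UnicodeEncodeError, LookupError):
        # UnicodeEncodeError: a character does not exist in the charset.
        # LookupError: Python does not know the charset name at all.
        return text.encode(fallback), fallback


def encode_for_html(text, charset):
    """Keep the declared charset, but write characters it cannot
    represent as numeric character references such as &#8220;."""
    return text.encode(charset, 'xmlcharrefreplace')


quoted = u'\u201cquoted\u201d'
data, used = encode_with_fallback(quoted, 'latin-1')
html_data = encode_for_html(quoted, 'latin-1')
```

With the first approach the page would also have to tell the browser which
charset was actually used, otherwise the output will be displayed wrongly.
The second approach avoids that, but it only makes sense where the output
is HTML.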

Just my 2 cents

Best regards
--
Tomas