[PATCH 00/17] nmbug-status: Python-3-compatibility and general refactoring

W. Trevor King wking at tremily.us
Tue Feb 4 08:11:42 PST 2014


On Tue, Feb 04, 2014 at 12:30:30PM +0200, Tomi Ollila wrote:
> On Tue, Feb 04 2014, "W. Trevor King" wrote:
> > On Mon, Feb 03, 2014 at 11:10:23PM +0200, Tomi Ollila wrote:
> >>   File "devel/nmbug/nmbug-status", line 197, in _write_threads
> >>     ).format(**message_display_data))
> >>   File "/usr/lib64/python2.6/codecs.py", line 351, in write
> >>     data, consumed = self.encode(object, self.errors)
> >> UnicodeEncodeError: 'ascii' codec can't encode character u'\u017b' in
> >>   position 176: ordinal not in range(128)
> >
> > Hmm.  __future__'s unicode_literals should be giving us a Unicode
> > target, so I'm not sure why we'd have trouble injecting Unicode.  This
> > works fine for me on Python 2.7 and 3.3.  Maybe you just have a funky
> > encoding?  …
> 
> LANG=en_US.UTF-8
> all other LC_* variables en_US.UTF-8 except
> LC_TIME=en_GB.utf8
> LC_ALL empty (naturally)
> 
> python -c 'import locale; print(locale.getpreferredencoding())'
> UTF-8
> python -c 'import sys; print(sys.getdefaultencoding())'
> ascii

That's very strange.  On Python 2.6.9, with the same encodings, I get:

  >>> from __future__ import unicode_literals
  >>> import codecs
  >>> import locale
  >>> import sys
  >>> print(locale.getpreferredencoding())  # same as yours
  UTF-8
  >>> print(sys.getdefaultencoding())  # same as yours
  ascii
  >>> _ENCODING = locale.getpreferredencoding() or sys.getdefaultencoding()
  >>> print(_ENCODING)  # double-check default encodings
  UTF-8
  >>> byte_stream = sys.stdout  # copied from Page.write
  >>> stream = codecs.getwriter(encoding=_ENCODING)(stream=byte_stream)
  >>> data = {'from': '\u017b'}  # fake the troublesome data
  >>> print(type(data['from']))  # double-check unicode_literals
  <type 'unicode'>
  >>> string = '  <td>{from}</td>\n'.format(**data)
  >>> stream.write(string)
    <td>Ż</td>

It looks like you'll have the same _ENCODING as I do (UTF-8).  That
means your stream should be wrapped in a UTF-8 StreamWriter, so I
don't understand why it's converting to ASCII.  Can you run through
the above on your troublesome machine and confirm that stream.write()
is still raising the exception?  If it still fails, can you paste the
whole run in your next email?
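
If it helps to take the terminal out of the picture, here is a minimal
sketch of the same wrapping with the encoding made explicit.  It is
not code from nmbug-status: write_row is a hypothetical helper, and
io.BytesIO is my stand-in for sys.stdout so the encoded bytes can be
inspected.  It only shows that a UTF-8 StreamWriter accepts the
character while an ASCII one raises the UnicodeEncodeError from your
traceback; it does not explain why your stream would end up ASCII.

```python
from __future__ import print_function, unicode_literals

import codecs
import io


def write_row(encoding, text):
    """Write `text` through a codecs StreamWriter and return the raw bytes.

    Hypothetical helper for illustration; io.BytesIO replaces sys.stdout.
    """
    byte_stream = io.BytesIO()
    stream = codecs.getwriter(encoding)(byte_stream)
    stream.write(text)
    return byte_stream.getvalue()


# Same fake data as in the session above.
row = '  <td>{from}</td>\n'.format(**{'from': '\u017b'})

# A UTF-8 writer encodes U+017B as the two bytes C5 BB.
utf8_bytes = write_row('UTF-8', row)
print(repr(utf8_bytes))

# An ASCII writer cannot represent U+017B and raises instead.
try:
    write_row('ascii', row)
except UnicodeEncodeError as error:
    ascii_error = error
    print('ascii writer failed:', error.reason)
else:
    ascii_error = None
```

If the ASCII branch is the one that matches your traceback, then the
stream you are writing to is probably not being wrapped with the
_ENCODING we expect, and the pasted run should show where the two
diverge.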

Thanks,
Trevor

-- 
This email may be signed or encrypted with GnuPG (http://www.gnupg.org).
For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy