[PATCH 00/17] nmbug-status: Python-3-compatibility and general refactoring
Tomi Ollila
tomi.ollila at iki.fi
Tue Feb 4 10:40:18 PST 2014
On Tue, Feb 04 2014, "W. Trevor King" <wking at tremily.us> wrote:
>
> >>> from __future__ import unicode_literals
> >>> import codecs
> >>> import locale
> >>> import sys
> >>> print(locale.getpreferredencoding()) # same as yours
> UTF-8
> >>> print(sys.getdefaultencoding()) # same as yours
> ascii
> >>> _ENCODING = locale.getpreferredencoding() or sys.getdefaultencoding()
> >>> print(_ENCODING) # double-check default encodings
> UTF-8
> >>> byte_stream = sys.stdout # copied from Page.write
> >>> stream = codecs.getwriter(encoding=_ENCODING)(stream=byte_stream)
> >>> data = {'from': '\u017b'} # fake the troublesome data
> >>> print(type(data['from'])) # double-check unicode_literals
> <type 'unicode'>
> >>> string = ' <td>{from}</td>\n'.format(**data)
> >>> stream.write(string)
> <td>Ż</td>
>
> It looks like you'll have the same _ENCODING as I do (UTF-8). That
> means your stream should be wrapped in a UTF-8 StreamWriter, so I
> don't understand why it's converting to ASCII. Can you run through
> the above on your troublesome machine and confirm that stream.write()
> is still raising the exception? If it doesn't work, can you just
> paste that whole run in your next email?
I don't know what to paste, so I'll paste this:
$ python
Python 2.6.6 (r266:84292, Nov 21 2013, 12:39:37)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> data = {'from': '\u017b'}
>>> print(type(data['from']))
<type 'str'>
>>> string = ' <td>{from}</td>\n'.format(**data)
>>> print string
<td>\u017b</td>
and then:
>>> data = {'from': u'\u017b'}
>>> print(type(data['from']))
<type 'unicode'>
>>> string = ' <td>{from}</td>\n'.format(**data)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\u017b' in position 0: ordinal not in range(128)
... and ...
>>> import os
>>> print os.environ['LANG']
en_US.UTF-8
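For comparison, here is a minimal sketch of what seems to differ between the two runs above. Without the `from __future__ import unicode_literals` line, the template ' <td>{from}</td>\n' is a byte string under Python 2, so formatting a unicode value into it forces an implicit ASCII encode, whatever LANG says. The sketch below assumes the import is in effect and uses an io.BytesIO as a stand-in for sys.stdout (my assumption, only so the result can be inspected); the names `template` and `encoded` are just for illustration.

```python
# -*- coding: utf-8 -*-
# Sketch only: with unicode_literals, both the template and the data are
# unicode, so .format() never has to fall back to the ASCII codec.
from __future__ import unicode_literals

import codecs
import io

data = {'from': '\u017b'}          # unicode text, as in the quoted session
template = ' <td>{from}</td>\n'    # also unicode because of the import above
string = template.format(**data)   # unicode template + unicode value: no implicit encode

# Stand-in for sys.stdout (assumption): collect the bytes in memory.
byte_stream = io.BytesIO()
stream = codecs.getwriter('utf-8')(byte_stream)
stream.write(string)               # the StreamWriter does the UTF-8 encoding

encoded = byte_stream.getvalue()   # the bytes that would reach the terminal
```

If that reading is right, the traceback above comes from the format() call itself and not from the StreamWriter, so the encoding settings never get a chance to matter.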
> Thanks,
> Trevor
Tomi