[PATCH v2 14/20] nmbug-status: Encode output using the user's locale

W. Trevor King wking at tremily.us
Tue Feb 11 12:11:35 PST 2014


On Tue, Feb 11, 2014 at 04:14:45PM +0200, Tomi Ollila wrote:
> On Tue, Feb 11 2014, David Bremner wrote:
> > W. Trevor King writes:
> >> Instead of always writing UTF-8, allow the user to configure the
> >> output encoding using their locale.  This is useful for
> >> previewing output in the terminal, for poor souls that don't use
> >> UTF-8 locales ;).
> >
> > …
> > remote: UnicodeEncodeError: 'ascii' codec can't encode character
> >   u'\u017b' in position 219: ordinal not in range(128)
> >
> > possibly because of
> >
> > LANG=C
> > …
> >
> > I think it's fine to _allow_ the user to configure the output
> > encoding. I'm less sure about _requiring_ it.

If a user has set LANG=C, I expect that's what we should use for
output (in which case dying with an encoding error is the right thing
to do).  If you want UTF-8 output, using a UTF-8 locale seems like a
reasonable requirement.  For the HTML case, we could fall back on
numeric character references (e.g. &#379;) if the requested locale
didn't support the required character directly, but I don't see an
easy solution for the text-mode output.
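
A rough sketch of that HTML fallback, not the code from this patch:
encode_for_output and its html flag are hypothetical names made up
for illustration.  It relies on Python's stock 'xmlcharrefreplace'
error handler, and leaves text-mode output strict so an unsupported
character still raises UnicodeEncodeError:

```python
def encode_for_output(text, encoding, html=False):
    """Encode text using the requested output encoding.

    Hypothetical helper: for HTML, characters the encoding cannot
    represent become numeric character references; for plain text,
    the UnicodeEncodeError is allowed to propagate.
    """
    errors = 'xmlcharrefreplace' if html else 'strict'
    return text.encode(encoding, errors)
```

With an ASCII encoding (as LANG=C would typically give), the HTML
path turns U+017B into the bytes for &#379;, while the text path
fails just like the traceback quoted above.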

> That reminded me that yesterday (after review, of course) I thought
> that we probably want configuration file to be parsed as utf-8
> instead of any encoding user may have in their system...

The POSIX spec for LANG doesn't restrict the scoping to the terminal
input / output [1], so I feel like we should be using LANG to
read the config file as well.  I expect folks with UTF-8 LANGs will
want UTF-8 file contents.  In both cases (terminal output and
config-file input), it is easy for users to pick their preferred
encoding:

  $ LANG=en_US.UTF-8 nmbug-status …

I think we should trust what they've chosen, rather than guessing that
they actually want UTF-8 ;).
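
To make that concrete, here is a minimal sketch of deriving one
encoding from the environment and using it for config-file input.
The names preferred_encoding and read_config are hypothetical and are
not taken from nmbug-status; the locale and io calls are from the
standard library:

```python
import io
import locale


def preferred_encoding():
    """Return the encoding implied by the user's locale settings."""
    try:
        # Adopt LANG / LC_* from the environment instead of the
        # default "C" locale that a process starts with.
        locale.setlocale(locale.LC_ALL, '')
    except locale.Error:
        # Unknown or uninstalled locale: keep the current setting.
        pass
    return locale.getpreferredencoding(False)


def read_config(path, encoding=None):
    """Read a config file, decoding with the locale's encoding.

    Hypothetical helper; pass encoding explicitly to override.
    """
    if encoding is None:
        encoding = preferred_encoding()
    with io.open(path, 'r', encoding=encoding) as stream:
        return stream.read()
```

The same preferred_encoding() value could then be handed to whatever
writes the output, so input and output both follow the user's choice.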

Cheers,
Trevor

[1]: http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html#tag_08_02

-- 
This email may be signed or encrypted with GnuPG (http://www.gnupg.org).
For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy