locales and notmuch
marmstrong at google.com
Fri Feb 22 16:26:31 PST 2019
David Bremner <david at tethera.net> writes:
> David Bremner <david at tethera.net> writes:
>> Otherwise we could insist they are UTF-8, ignoring the locale. The
>> fullest generality (I think) is to first convert from the user's locale
>> to UTF-8, as in the attached sample program. The gotcha is that the call
>> to setlocale is necessary, and can't really be local to a string utility
>> function. So we'd have to add that to notmuch startup. We mostly ignore
>> locales, so I guess there shouldn't be too many side effects; otoh I
>> don't have much experience with locales.
> 1) It might be possible to save and restore the locale, although that
> sounds a bit heavyweight for lowercasing a string.
> 2) We'd need a UTF-8 locale to test in. I guess C.UTF-8 is not yet
> universally available.
Notmuch should probably adopt a coherent strategy with respect to
character set encodings, rather than do something ad hoc for this one
feature. Most systems I have worked with normalize to UTF-8 at the
edges and do all internal work in that encoding.
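
To make "normalize at the edges" concrete, here is a minimal sketch in C
of converting a string from the user's locale charset to UTF-8 with
iconv. The helper names to_utf8 and locale_to_utf8 are hypothetical and
are not existing notmuch functions; the sketch also assumes that
setlocale (LC_CTYPE, "") has already been called once at startup, since
otherwise nl_langinfo (CODESET) reports the charset of the "C" locale.

```c
#include <iconv.h>
#include <langinfo.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of notmuch): convert `in` from the
 * charset named by `from_codeset` to UTF-8.  Returns a malloc'd,
 * NUL-terminated string that the caller must free, or NULL on error. */
char *
to_utf8 (const char *in, const char *from_codeset)
{
    iconv_t cd = iconv_open ("UTF-8", from_codeset);
    if (cd == (iconv_t) -1)
        return NULL;

    size_t in_left = strlen (in);
    /* Generous upper bound for this sketch: 4 output bytes per input byte. */
    size_t out_size = in_left * 4 + 1;
    char *out = malloc (out_size);
    if (out == NULL) {
        iconv_close (cd);
        return NULL;
    }

    char *in_ptr = (char *) in;   /* iconv takes char **, but does not modify the input */
    char *out_ptr = out;
    size_t out_left = out_size - 1;

    size_t rc = iconv (cd, &in_ptr, &in_left, &out_ptr, &out_left);
    iconv_close (cd);

    if (rc == (size_t) -1) {
        /* Invalid or incomplete input sequence for `from_codeset`. */
        free (out);
        return NULL;
    }

    *out_ptr = '\0';
    return out;
}

/* Hypothetical helper: convert from the charset of the current locale.
 * Assumes setlocale (LC_CTYPE, "") was called during program startup. */
char *
locale_to_utf8 (const char *in)
{
    return to_utf8 (in, nl_langinfo (CODESET));
}
```

With something like this at the boundary, a lowercasing routine would
only ever see UTF-8 and would not need to touch the locale itself. The
setlocale call would still have to live in startup code, as David notes.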
It is an interesting question: what encoding does .notmuch-config use?
UTF-8? User's choice? Similarly, what is the encoding of notmuch's
command line args?
I was just reading https://xapian.org/features and Xapian seems to store
text in UTF-8. If this is the case, where is the code that does the
charset conversion from the email messages to UTF-8? And where is the
conversion from the command line args to UTF-8?