[PATCH] nmbug: Allow Unicode tags and IDs in Python 2

W. Trevor King wking at tremily.us
Tue Feb 16 09:56:42 PST 2016


On Tue, Feb 16, 2016 at 09:04:07AM -0400, David Bremner wrote:
> W. Trevor King writes:
> > Coercing to UTF-8 (regardless of locale) gives us consistent tag
> > IDs for sharing between users.
> 
> I'm not sure what "tag IDs" are. Do you mean message-ids here? or "tags
> and IDs"?

Yeah.  I'll fix that in v2.

> At first I thought there might be problems with non-utf8 message-ids,
> but that turns out not to be the case [1].  It seems like it would take
> a fairly heroic effort to get non-UTF8 tags into the database (perhaps
> by calling the library interface with bad strings?) so we can probably
> ignore this case. It might be good to document the limitation though,
> since AFAIK, dump and restore can roundtrip any old crap.

How about in a NEWS entry in v2 of this series, and then echoing that
NEWS entry in the notmuch-dtags (or whatever) man page once I work up
that series?

> > The 'isnumeric' check identifies Unicode instances in both Python
> > 2 [9] and Python 3 [10].
> 
> I still haven't really tried to understand this part, but probably
> it deserves inline documentation.

It's just “if you have a Unicode instance (str in Python 3, unicode in
Python 2), convert it to bytes (bytes in Python 3, str in Python 2)”.
Only Unicode instances will have an ‘isnumeric’ method, so it's a
convenient marker for switching that logic.  I'll add a “convert from
Unicode if necessary” comment to v2.

> > I haven't checked the other commands for issues with Unicode IDs
> > or tags.  It's possible that in addition to this explicit encoding
> > to UTF-8, we'll also want explicit decoding from UTF-8 when
> > reading from Git trees (for 'nmbug checkout' and 'nmbug status').
> 
> Yes, this seems to be a problem, with the patch applied I can
> commit, but the same utf-8 message-id causes problems.

Ugh.  Thanks for checking.  I'll try to fix all the places where this
would have an impact in v2 of this series.

Cheers,
Trevor

-- 
This email may be signed or encrypted with GnuPG (http://www.gnupg.org).
For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy