problems with nmbug and empty prefix (UnicodeWarning and broken pipe)

W. Trevor King wking at tremily.us
Sat Feb 13 22:31:32 PST 2016


On Sat, Feb 13, 2016 at 10:41:40PM -0400, David Bremner wrote:
> Traceback (most recent call last):
>   File "/home/bremner/.config/scripts/nmbug.real", line 834, in <module>
>     args.func(**kwargs)
>   File "/home/bremner/.config/scripts/nmbug.real", line 324, in commit
>     status = get_status()
>   File "/home/bremner/.config/scripts/nmbug.real", line 581, in get_status
>     index = _index_tags()
>   File "/home/bremner/.config/scripts/nmbug.real", line 621, in _index_tags
>     git.stdin.write(line)

This traceback is pointing at what should be a stream write, so I
don't see how urllib is involved there at all.  I guess this traceback
ends up in the “Broken pipe” message from your original post?

Dropping some debugging prints into the:

  for line in notmuch.stdout:

block will likely get us close enough to figure out which line in the
‘notmuch dump …’ output is causing the problem.
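
As a rough sketch of the kind of debugging print I mean (not the actual
nmbug loop, which also rewrites each line before writing it), the helper
below logs every line just before forwarding it, so the last debug
message names the line that was in flight when the write failed.  The
forward_lines name, its arguments, and the logger name are all
hypothetical:

```python
import logging

_LOG = logging.getLogger('nmbug-debug')  # hypothetical logger name


def forward_lines(source, sink):
    """Copy lines from source to sink, logging each one first.

    Returns the number of lines written.  If sink.write() raises, the
    most recent debug message shows the line that triggered it.
    """
    count = 0
    for line in source:
        _LOG.debug('forwarding line %d: %r', count + 1, line)
        sink.write(line)
        count += 1
    return count
```

Logging with %r matters here, since it makes escapes and non-ASCII
bytes visible instead of leaving them to the terminal.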

> > We only call ‘notmuch dump …’ from _index_tags, where dump's stdout is
> > tweaked and fed into ‘git update-index …’.  Your urllib UnicodeWarning
> > suggests the issue lies in:
> >
> >   tags = [
> >       _unquote(tag[len(prefix):])
> >       for tag in tags_string.split()
> >       if tag.startswith(prefix)]
> 
> Looking at the source for urllib, that line is actually in quote,
> which is called only from _hex_quote

And we call _hex_quote from _index_tags_for_message, which is right
before the git.stdin.write line from your traceback.  So it's certainly
possible that we're feeding _hex_quote something it can't handle in
Python 2.  If I could reproduce this locally, I'd probably drop a
debugging print in there as well:

  for tag in tags:
      _LOG.debug('building a quoted path for {!r} / {!r}'.format(id, tag))
      path = 'tags/{id}/{tag}'.format(
          id=_hex_quote(string=id), tag=_hex_quote(string=tag))
      yield '{mode} {hash}\t{path}\n'.format(mode=mode, hash=hash, path=path)
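
One way to rule out mixed bytes/text input is to encode explicitly
before quoting.  This is a Python 3 sketch only, assuming the quote in
question behaves like urllib.parse.quote; quote_component, tag_path, and
the UTF-8 choice are my assumptions for illustration, not nmbug's code:

```python
from urllib.parse import quote


def quote_component(value, encoding='utf-8'):
    """Percent-encode one path component, accepting text or bytes."""
    if isinstance(value, str):
        # Encode up front so quote() only ever sees bytes.
        value = value.encode(encoding)
    # With safe='' everything outside letters, digits, and '_.-~'
    # is escaped, including '/' and '@'.
    return quote(value, safe='')


def tag_path(message_id, tag):
    """Build a 'tags/<id>/<tag>' path from quoted components."""
    return 'tags/{id}/{tag}'.format(
        id=quote_component(message_id), tag=quote_component(tag))
```

If a particular id or tag only fails when passed as one type and not the
other, that narrows down where the conversion is going wrong.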

> Unfortunately despite my best efforts with filterwarnings, I
> couldn't figure out how to get a stack trace for that
> UnicodeWarning.

I haven't spent much time with filterwarnings.  My guess is that:

  $ python -W error ./nmbug --log-level debug commit

will turn it into a raised exception [1].  But you may have tried
that, and it may not have worked for some reason :p.
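
The same promotion can be done from inside Python with the warnings
module, which is sometimes easier than getting the command-line flag
through a wrapper script.  A minimal sketch, where noisy is a
hypothetical stand-in for whichever call emits the UnicodeWarning and
run_with_warnings_as_errors is a made-up helper name:

```python
import traceback
import warnings


def noisy():
    # Hypothetical stand-in for the call that emits the warning.
    warnings.warn('comparison failed to convert arguments', UnicodeWarning)
    return 'finished'


def run_with_warnings_as_errors(func):
    """Call func with UnicodeWarning promoted to an exception.

    Returns the formatted traceback if the warning fired, else None.
    """
    with warnings.catch_warnings():
        # 'error' turns matching warnings into raised exceptions.
        warnings.simplefilter('error', UnicodeWarning)
        try:
            func()
        except UnicodeWarning:
            return traceback.format_exc()
    return None
```

Printing the returned string should show the frame that issued the
warning, which is the stack trace we are after.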

If dropping debugging prints into the relevant code sections doesn't
turn up the problem, ‘strace -o /tmp/trace -f nmbug --log-level debug
commit’ will likely capture enough of the data moving between
processes for us to figure out what nmbug is choking on.

Another alternative would be to check your list of censored tags for
anything that looks like it might contain Unicode-issue-triggering
characters.  What is your locale?  Do you have any tags with non-ASCII
characters?  You should be able to isolate this problem by iterating
through all your tags:

  $ for TAG in <censored>
  > do
  >   echo "${TAG}"
  >   NMBPREFIX="${TAG%?}" nmbug commit
  > done

and see which one acts up.
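
For the non-ASCII question, a quick filter over the tag list may be
faster than running a commit per tag.  A small sketch, assuming you can
get the tags into a Python list; non_ascii_tags is a hypothetical helper
and the example tags in the comment are invented:

```python
def non_ascii_tags(tags):
    """Return the tags containing at least one non-ASCII character."""
    return [
        tag for tag in tags
        if any(ord(char) > 127 for char in tag)]


# Example with made-up tags:
#   non_ascii_tags(['notmuch::bug', 'caf\u00e9'])
```

Any tag this returns is a good first candidate for the loop above.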

Cheers,
Trevor

[1]: https://docs.python.org/2/library/warnings.html#warning-filter
