[PATCH] cli: notmuch-show with framing newlines between threads in JSON.

Austin Clements amdragon at MIT.EDU
Sun Jul 1 20:52:41 PDT 2012


Quoth myself on Jul 01 at  8:12 pm:
> Quoth Tomi Ollila on Jul 02 at  1:13 am:
> > On Sat, Jun 30 2012, Mark Walters <markwalters1009 at gmail.com> wrote:
> > 
> > > Add newlines between complete threads to make asynchronous parsing
> > > of the JSON easier.
> > > ---
> > >
> > > notmuch-pick uses the JSON output of notmuch show but, in many cases,
> > > for many threads. This can take quite a long time when displaying a
> > > large number of messages (say 20 seconds for the 10,000 messages in
> > > the notmuch archive). Thus it is desirable to display results
> > > incrementally in the same way that search currently does.
> > >
> > > To make this easier this patch adds newlines between each toplevel
> > > thread. So the output becomes
> > >
> > > [
> > > thread1
> > > , thread2
> > > , thread3
> > > ...
> > > , last_thread
> > > ]
> > >
> > > Thus the parser can easily tell if it has enough data to do some more
> > > parsing.
> > >
> > > Obviously, this changes the JSON output. This should not break any
> > > consumer as the JSON parsers should not mind. However, it does break
> > > several tests. Obviously, I will fix these but I wanted to check if
> > > people were basically happy with the change first.
> > 
> > To provide this feature, rather than relying on newlines, the parser
> > should use its state to notice when one thread ends.
> > 
> > Such a change could be used (privately) for human consumption -- allowing 
> > free change of whitespace during inspection (in a debugging session or so).
> > Computer software should not rely (or suffer) from any additional
> > (or lack thereof) whitespace there is...
> > 
> > ... or at least a really convincing argument for the change needs to
> > be presented (before "restricting" the json output notmuch spits out).
> 
> Given a JSON parser that only knows how to parse complete JSON
> expressions, it's potentially very inefficient to keep attempting to
> parse something when you don't know if it's complete.  The newlines
> provide an in-band framing so the consumer knows when there's a
> complete object to be parsed.
> 
> In effect, this defines a super-protocol of JSON that's compatible
> with standard JSON, but easy to incrementally parse.
> 
> That said, just this weekend I implemented JSON-based search with
> incremental JSON parsing and I took a slightly different approach.  I
> still put framing into the newlines of the search results, but rather
> than rely on it for correctness, the consumer uses it as an
> optimization that only hints that a complete JSON expression is
> probably available.  If the expression turns out to be incomplete,
> that's okay.
> 
> I considered building a fully-incremental JSON parser that never
> backtracks by more than a token, which would eliminate even the cost
> of reparsing, but if we do move to S-expressions (which I think we
> should), we want to let Emacs' C implementation do as much of the
> parsing as possible, and the only thing we can do with that is read a
> complete expression.

Actually, I take that back.  While we can't do fast incremental
S-expression parsing, `parse-partial-sexp' can tell us incrementally
(and probably very quickly) *if* there's a complete expression ready
to parse, so we could avoid calling into the parser at all unless it
would succeed.
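
As a rough illustration of the same idea applied to JSON, here is a
minimal sketch in Python rather than Emacs Lisp. It is an analogue of
what `parse-partial-sexp' offers, not a use of it: a cheap scanner
tracks only nesting depth and string state across chunks, and the real
parser is called once the scanner reports a complete value. The names
CompletenessScanner and parse_when_ready are hypothetical, and the
sketch assumes the top-level value is an array or object.

```python
import json


class CompletenessScanner:
    """Track nesting depth and string state across chunks.

    Builds no values; it only answers "is the top-level value closed yet?".
    Assumes the top-level value is an array or an object.
    """

    def __init__(self):
        self.depth = 0
        self.in_string = False
        self.escaped = False
        self.started = False

    def feed(self, chunk):
        """Scan one new chunk. Return the index just past the closing
        bracket of the top-level value, or None if it is still open."""
        for i, ch in enumerate(chunk):
            if self.in_string:
                if self.escaped:
                    self.escaped = False      # character after a backslash
                elif ch == "\\":
                    self.escaped = True
                elif ch == '"':
                    self.in_string = False
            elif ch == '"':
                self.in_string = True
            elif ch in "[{":
                self.depth += 1
                self.started = True
            elif ch in "]}":
                self.depth -= 1
                if self.started and self.depth == 0:
                    return i + 1
        return None


def parse_when_ready(chunks):
    """Return (value, parser_calls); value is None if input never completes."""
    scanner = CompletenessScanner()
    buffer = ""
    calls = 0
    for chunk in chunks:
        end = scanner.feed(chunk)             # each character is scanned once
        if end is None:
            buffer += chunk
            continue
        buffer += chunk[:end]
        calls += 1                            # the only call into the parser
        return json.loads(buffer), calls
    return None, calls
```

Each character is scanned exactly once, so the full parser never runs
on incomplete input, even when brackets or escaped quotes appear inside
strings.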

I'll try this out in my incremental JSON parser and see how well it
works.
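
For comparison, the newline-as-hint approach quoted above might look
roughly like the following Python sketch. It assumes the output layout
from the patch description (an outer array with one thread per line,
each thread itself an array or object); HintedThreadReader is a
hypothetical name, not part of notmuch. A newline only triggers a parse
attempt; if the data turns out to be incomplete, the attempt fails
harmlessly and the reader waits for more input.

```python
import json


class HintedThreadReader:
    """Consume '[\\nthread1\\n, thread2\\n...]' incrementally.

    A newline is treated as a hint that a complete thread is probably
    available; correctness never depends on it.
    """

    def __init__(self):
        self.buffer = ""
        self.opened = False                   # outer '[' consumed yet?
        self.attempts = 0                     # number of parser calls made
        self.decoder = json.JSONDecoder()

    def feed(self, chunk):
        """Add data and return the threads that became parseable."""
        self.buffer += chunk
        if "\n" not in chunk:
            return []                         # no hint, so do not try to parse
        threads = []
        while True:
            rest = self.buffer.lstrip()
            if not self.opened:
                if not rest.startswith("["):
                    break
                self.opened = True            # strip the outer '[' only once
                self.buffer = rest[1:]
                continue
            if rest.startswith(","):
                rest = rest[1:].lstrip()      # separator between threads
            if not rest or rest.startswith("]"):
                self.buffer = rest
                break
            self.attempts += 1
            try:
                thread, end = self.decoder.raw_decode(rest)
            except ValueError:
                break                         # incomplete: wait for more data
            threads.append(thread)
            self.buffer = rest[end:]
        return threads
```

Because a failed attempt leaves the buffer untouched, a producer that
omits or adds newlines costs only some wasted or delayed parse attempts,
never a wrong result.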

