[PATCH] cli: notmuch-show with framing newlines between threads in JSON.

Austin Clements amdragon at MIT.EDU
Sun Jul 1 17:12:35 PDT 2012


Quoth Tomi Ollila on Jul 02 at  1:13 am:
> On Sat, Jun 30 2012, Mark Walters <markwalters1009 at gmail.com> wrote:
> 
> > Add newlines between complete threads to make asynchronous parsing
> > of the JSON easier.
> > ---
> >
> > notmuch-pick uses the JSON output of notmuch show but, in many cases,
> > for many threads. This can take quite a long time when displaying a
> > large number of messages (say 20 seconds for the 10,000 messages in
> > the notmuch archive). Thus it is desirable to display results
> > incrementally in the same way that search currently does.
> >
> > To make this easier this patch adds newlines between each toplevel
> > thread. So the output becomes
> >
> > [
> > thread1
> > , thread2
> > , thread3
> > ...
> > , last_thread
> > ]
> >
> > Thus the parser can easily tell if it has enough data to do some more
> > parsing.
> >
> > Obviously, this changes the JSON output. This should not break any
> > consumer as the JSON parsers should not mind. However, it does break
> > several tests. Obviously, I will fix these but I wanted to check if
> > people were basically happy with the change first.
> 
> To provide this feature rather than relying on newlines the parser should
> use its state to notice when one thread ends.
> 
> Such a change could be used (privately) for human consumption -- allowing 
> free change of whitespace during inspection (in a debugging session or so).
> Computer software should not rely on (or suffer from) any additional
> whitespace (or the lack thereof)...
> 
> ... or at least a really convincing argument for the change needs to
> be presented (before "restricting" the json output notmuch spits out).

Given a JSON parser that only knows how to parse complete JSON
expressions, it's potentially very inefficient to keep attempting to
parse something when you don't know if it's complete.  The newlines
provide an in-band framing so the consumer knows when there's a
complete object to be parsed.

In effect, this defines a super-protocol of JSON that's compatible
with standard JSON, but easy to incrementally parse.
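To make the framing idea concrete, here is a minimal sketch in Python
of a consumer that relies on the newlines for correctness.  It assumes
the layout from the patch description (a lone "[" line, one thread per
line with a leading comma on all but the first, then a lone "]" line);
the class and method names are made up for illustration and are not
part of notmuch.

```python
import json


class FramedThreadReader:
    """Hypothetical consumer that trusts the newline framing completely."""

    def __init__(self):
        # Text received after the last newline; it may be a partial thread.
        self._pending = ""

    def feed(self, chunk):
        """Accept a chunk of output and return the threads it completed."""
        self._pending += chunk
        # Everything before the last newline is made of complete lines.
        *lines, self._pending = self._pending.split("\n")
        threads = []
        for line in lines:
            text = line.strip()
            if text in ("", "[", "]"):
                continue  # outer brackets of the top-level list
            if text.startswith(","):
                text = text[1:]  # separator between threads
            # Assumption: one complete thread per line, so this never
            # sees a partial expression.
            threads.append(json.loads(text))
        return threads
```

The consumer never calls the JSON parser on incomplete input, but it
breaks if the producer ever changes where it puts newlines.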

That said, just this weekend I implemented JSON-based search with
incremental JSON parsing and I took a slightly different approach.  I
still put newline framing into the search results, but rather
than rely on it for correctness, the consumer uses it as an
optimization that only hints that a complete JSON expression is
probably available.  If the expression turns out to be incomplete,
that's okay.
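A rough Python sketch of the hint-based approach, not the actual
patches: a newline only triggers a parse attempt, and a failed attempt
just means "wait for more data".  The names here are hypothetical, and
it assumes the top-level elements are JSON objects or arrays (a bare
number could be decoded before it is complete).

```python
import json


class HintedReader:
    """Hypothetical consumer that treats newlines as a hint only."""

    def __init__(self):
        self._buf = ""
        self._opened = False  # has the outer "[" been consumed?
        self._decoder = json.JSONDecoder()
        self.attempts = 0  # number of parse attempts, for illustration

    def feed(self, chunk):
        """Accept a chunk and return any elements that are now complete."""
        self._buf += chunk
        items = []
        if "\n" not in chunk:
            return items  # no hint, so do not bother trying to parse
        while True:
            # Skip whitespace and the separators between elements.
            self._buf = self._buf.lstrip(" \t\r\n,")
            if not self._opened:
                if not self._buf.startswith("["):
                    break
                self._opened = True
                self._buf = self._buf[1:]
                continue
            if not self._buf or self._buf.startswith("]"):
                break
            self.attempts += 1
            try:
                item, end = self._decoder.raw_decode(self._buf)
            except json.JSONDecodeError:
                # Treated as "incomplete": keep the buffer and wait.
                # This sketch cannot tell that apart from malformed input.
                break
            items.append(item)
            self._buf = self._buf[end:]
        return items
```

If the producer reflows its whitespace, this consumer still produces
the right results; it only pays for some wasted parse attempts.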

I considered building a fully-incremental JSON parser that never
backtracks by more than a token, which would eliminate even the cost
of reparsing, but if we do move to S-expressions (which I think we
should), we want to let Emacs' C implementation do as much of the
parsing as possible, and the only thing we can do with that is read a
complete expression.

> Btw: AFAICT (json-read) parses the whole json object (ignoring whitespace,
> including newlines outside strings). So I guess notmuch-pick uses something
> slightly different (probably using json.el subroutines)..
> 
> Btw2: I'm very interested to see notmuch-pick in action -- I just don't
> see this as a way to do this particular support properly.
> 
> Btw3: if search is ever going to use json we'll face the same problem --
> unless writing each line as a separate json object (and starting to use 
> s-expressions for speed)

Done.  I'll post the patches after a little more cleanup.

> > Also, should devel/schemata be updated? It seems a little unclear as
> > this is not really a "JSON" change as the JSON does not care about the
> > newlines.
> >
> > Best wishes
> 
> and best luck with your notmuch-pick work.
> 
> >
> > Mark
> 
> Tomi

