[PATCH] cli: notmuch-show with framing newlines between threads in JSON.

Mark Walters markwalters1009 at gmail.com
Sat Jul 7 22:30:28 PDT 2012


On Mon, 02 Jul 2012, Austin Clements <amdragon at MIT.EDU> wrote:
> Quoth myself on Jul 01 at  8:12 pm:
>> Quoth Tomi Ollila on Jul 02 at  1:13 am:
>> > On Sat, Jun 30 2012, Mark Walters <markwalters1009 at gmail.com> wrote:
>> > 
>> > > Add newlines between complete threads to make asynchronous parsing
>> > > of the JSON easier.
>> > > ---
>> > >
>> > > notmuch-pick uses the JSON output of notmuch show but, in many cases,
>> > > for many threads. This can take quite a long time when displaying a
>> > > large number of messages (say 20 seconds for the 10,000 messages in
>> > > the notmuch archive). Thus it is desirable to display results
>> > > incrementally in the same way that search currently does.
>> > >
>> > > To make this easier this patch adds newlines between each toplevel
>> > > thread. So the output becomes
>> > >
>> > > [
>> > > thread1
>> > > , thread2
>> > > , thread3
>> > > ...
>> > > , last_thread
>> > > ]
>> > >
>> > > Thus the parser can easily tell if it has enough data to do some more
>> > > parsing.
>> > >
>> > > Obviously, this changes the JSON output. This should not break any
>> > > consumer as the JSON parsers should not mind. However, it does break
>> > > several tests. Obviously, I will fix these but I wanted to check if
>> > > people were basically happy with the change first.
>> > 
>> > To provide this feature rather than relying on newlines the parser should
>> > use its state to notice when one thread ends.
>> > 
>> > Such a change could be used (privately) for human consumption -- allowing 
>> > free change of whitespace during inspection (in a debugging session or so).
>> > Computer software should not rely on (or suffer from) any additional
>> > whitespace (or the lack thereof)...
>> > 
>> > ... or at least a really convincing argument for the change needs to
>> > be presented (before "restricting" the json output notmuch spits out).
>> 
>> Given a JSON parser that only knows how to parse complete JSON
>> expressions, it's potentially very inefficient to keep attempting to
>> parse something when you don't know if it's complete.  The newlines
>> provide an in-band framing so the consumer knows when there's a
>> complete object to be parsed.
>> 
>> In effect, this defines a super-protocol of JSON that's compatible
>> with standard JSON, but easy to incrementally parse.
>> 
>> That said, just this weekend I implemented JSON-based search with
>> incremental JSON parsing and I took a slightly different approach.  I
>> still put framing into the newlines of the search results, but rather
>> than rely on it for correctness, the consumer uses it as an
>> optimization that only hints that a complete JSON expression is
>> probably available.  If the expression turns out to be incomplete,
>> that's okay.
>> 
>> I considered building a fully-incremental JSON parser that never
>> backtracks by more than a token, which would eliminate even the cost
>> of reparsing, but if we do move to S-expressions (which I think we
>> should), we want to let Emacs' C implementation do as much of the
>> parsing as possible, and the only thing we can do with that is read a
>> complete expression.
>
> Actually, I take that back.  While we can't do fast incremental
> S-expression parsing, `parse-partial-sexp' can tell us incrementally
> (and probably very quickly) *if* there's a complete expression ready
> to parse, so we could avoid calling into the parser at all unless it
> would succeed.
>
> I'll try this out in my incremental JSON parser and see how well it
> works.

I have converted pick to use Austin's incremental parser and all works
well, so this seems the way to go. Hence I have marked my original patch
obsolete.
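
For readers following the thread, the "newline as a hint" idea Austin
describes above can be sketched roughly as follows. This is only an
illustration in Python, not the Emacs Lisp parser that pick now uses. The
class name ThreadStreamParser and its feed method are hypothetical, and the
sketch assumes every top-level element is a JSON array or object (as threads
are), so a truncated element can never parse successfully by accident.

```python
import json

# Sentinel meaning "no complete element is available yet".
_INCOMPLETE = object()


class ThreadStreamParser:
    """Incrementally parse a JSON array of threads (hypothetical sketch).

    A newline in the input is treated only as a hint that a complete
    element is probably available. If the parse attempt fails, the data
    stays buffered and is retried when more input arrives.
    """

    def __init__(self):
        self._buf = ""
        self._started = False   # True once the opening '[' is consumed
        self.done = False       # True once the closing ']' is seen
        self._decoder = json.JSONDecoder()

    def feed(self, chunk):
        """Buffer chunk and return the list of newly completed elements."""
        self._buf += chunk
        results = []
        if "\n" not in chunk:
            # No framing hint in this chunk, so do not bother parsing.
            return results
        while True:
            item = self._try_one()
            if item is _INCOMPLETE:
                return results
            results.append(item)

    def _try_one(self):
        s = self._buf.lstrip()
        if not self._started:
            if not s.startswith("["):
                return _INCOMPLETE
            s = s[1:].lstrip()
        if s.startswith(","):
            s = s[1:].lstrip()
        if s.startswith("]"):
            self._started = True
            self.done = True
            self._buf = s[1:]
            return _INCOMPLETE
        try:
            # raw_decode parses one value and reports where it ended.
            value, end = self._decoder.raw_decode(s)
        except json.JSONDecodeError:
            # The hint was wrong: leave the buffer untouched and wait.
            return _INCOMPLETE
        self._started = True
        self._buf = s[end:]
        return value
```

A consumer would call feed with each chunk read from the notmuch show
process and display whatever threads come back. A failed attempt costs one
wasted parse of the buffered data, which is the reparsing cost mentioned
earlier in the thread.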

Best wishes

Mark

