[PATCH] cli: notmuch-show with framing newlines between threads in JSON.

Tomi Ollila tomi.ollila at iki.fi
Mon Jul 2 05:29:16 PDT 2012


On Mon, Jul 02 2012, Austin Clements <amdragon at MIT.EDU> wrote:

> Quoth myself on Jul 01 at  8:12 pm:
>> Quoth Tomi Ollila on Jul 02 at  1:13 am:
>> > On Sat, Jun 30 2012, Mark Walters <markwalters1009 at gmail.com> wrote:
>> > 
>> > > Add newlines between complete threads to make asynchronous parsing
>> > > of the JSON easier.
>> > > ---
>> > >
>> > > notmuch-pick uses the JSON output of notmuch show but, in many cases,
>> > > for many threads. This can take quite a long time when displaying a
>> > > large number of messages (say 20 seconds for the 10,000 messages in
>> > > the notmuch archive). Thus it is desirable to display results
>> > > incrementally in the same way that search currently does.
>> > >
>> > > To make this easier this patch adds newlines between each toplevel
>> > > thread. So the output becomes
>> > >
>> > > [
>> > > thread1
>> > > , thread2
>> > > , thread3
>> > > ...
>> > > , last_thread
>> > > ]
>> > >
>> > > Thus the parser can easily tell if it has enough data to do some more
>> > > parsing.
>> > >
>> > > Obviously, this changes the JSON output. This should not break any
>> > > consumer as the JSON parsers should not mind. However, it does break
>> > > several tests. Obviously, I will fix these but I wanted to check if
>> > > people were basically happy with the change first.
>> > 
>> > To provide this feature, rather than relying on newlines, the parser
>> > should use its state to notice when one thread ends.
>> > 
>> > Such a change could be used (privately) for human consumption -- allowing 
>> > free change of whitespace during inspection (in a debugging session or so).
>> > Computer software should neither rely on nor suffer from additional
>> > (or missing) whitespace...
>> > 
>> > ... or at least a really convincing argument for the change needs to
>> > be presented (before "restricting" the json output notmuch spits out).
>> 
>> Given a JSON parser that only knows how to parse complete JSON
>> expressions, it's potentially very inefficient to keep attempting to
>> parse something when you don't know if it's complete.  The newlines
>> provide an in-band framing so the consumer knows when there's a
>> complete object to be parsed.
>> 
>> In effect, this defines a super-protocol of JSON that's compatible
>> with standard JSON, but easy to incrementally parse.
>> 
>> That said, just this weekend I implemented JSON-based search with
>> incremental JSON parsing and I took a slightly different approach.  I
>> still put framing into the newlines of the search results, but rather
>> than rely on it for correctness, the consumer uses it as an
>> optimization that only hints that a complete JSON expression is
>> probably available.  If the expression turns out to be incomplete,
>> that's okay.
>> 
>> I considered building a fully-incremental JSON parser that never
>> backtracks by more than a token, which would eliminate even the cost
>> of reparsing, but if we do move to S-expressions (which I think we
>> should), we want to let Emacs' C implementation do as much of the
>> parsing as possible, and the only thing we can do with that is read a
>> complete expression.
>
> Actually, I take that back.  While we can't do fast incremental
> S-expression parsing, `parse-partial-sexp' can tell us incrementally
> (and probably very quickly) *if* there's a complete expression ready
> to parse, so we could avoid calling into the parser at all unless it
> would succeed.
>
> I'll try this out in my incremental JSON parser and see how well it
> works.

I played a bit with parse-partial-sexp (and sexp-at-point) and it looks
like this really could work. IMO the things to be done could be

1) add --format=sexp support to notmuch cli
2) add tests for that (I can do some tests)
3) review those patches (I will definitely be one reviewer)
4) convert emacs to use --format=sexp everywhere it now uses --format=json,
   (is it then basically s/(json-read)/(sexp-at-point)/ ?) and adjust tests.
5) review those patches (I will definitely be one reviewer)
6) add support for command lines like
   'notmuch search --output=sexp --sort=oldest-first tag:unread' ...
   (even I can do that) (of course --output=json would work too, as expected)
7) review...
8) convert emacs notmuch search to use that syntax and incrementally
   display progress.
9) review...
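
To make the "is there a complete expression yet?" check concrete, here is a
rough sketch in Python (not Emacs Lisp, and not what parse-partial-sexp
actually does internally). It keeps scanner state between chunks so nothing
is rescanned, and only hands out text once a top-level list has closed. The
class name SexpFramer is made up for illustration, and it assumes every
top-level expression is a parenthesized list with double-quoted strings and
backslash escapes.

```python
class SexpFramer:
    """Hypothetical helper: split a stream into complete top-level lists."""

    def __init__(self):
        self.buf = ""           # text not yet handed out
        self.pos = 0            # index of the next unscanned character
        self.depth = 0          # current parenthesis nesting depth
        self.in_string = False  # inside a double-quoted string?
        self.escaped = False    # previous character was a backslash?

    def feed(self, chunk):
        """Append chunk and return the list expressions it completed."""
        self.buf += chunk
        complete = []
        while self.pos < len(self.buf):
            ch = self.buf[self.pos]
            self.pos += 1
            if self.in_string:
                if self.escaped:
                    self.escaped = False
                elif ch == "\\":
                    self.escaped = True
                elif ch == '"':
                    self.in_string = False
            elif ch == '"':
                self.in_string = True
            elif ch == "(":
                self.depth += 1
            elif ch == ")":
                self.depth -= 1
                if self.depth < 0:
                    raise ValueError("unbalanced ')'")
                if self.depth == 0:
                    # A top-level list just closed: hand it to the real parser.
                    complete.append(self.buf[:self.pos].strip())
                    self.buf = self.buf[self.pos:]
                    self.pos = 0
        return complete
```

The consumer would call feed() from its process filter and pass each
returned string to the real reader, so the reader is never called on
incomplete input.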

This means that we would stop adding new features that use json output in
the emacs mua and concentrate on using sexps wherever applicable.

(during this we should be able to determine whether those framing newlines
 are needed or not)
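
For comparison, the "newline as a hint only" approach Austin describes above
could look roughly like the Python sketch below. It is an illustration under
assumptions, not his implementation: the class name ThreadStream is made up,
the input is assumed to be the "[ thread1 , thread2 ... ]" layout from
Mark's patch, and each thread is assumed to be a JSON array or object (so a
truncated thread can never parse successfully by accident).

```python
import json


class ThreadStream:
    """Hypothetical consumer that uses newlines only as a parsing hint."""

    def __init__(self):
        self.pending = ""       # unparsed input
        self.started = False    # seen the '[' that opens the outer list?
        self.decoder = json.JSONDecoder()

    def feed(self, data):
        """Append data and return the threads that could be parsed."""
        self.pending += data
        if "\n" not in data:
            # No newline arrived, so probably nothing new is complete.
            return []
        threads = []
        while True:
            s = self.pending.lstrip()
            if not self.started:
                if not s.startswith("["):
                    break
                self.started = True
                s = s[1:].lstrip()
            if s.startswith(","):
                s = s[1:].lstrip()
            if not s or s.startswith("]"):
                self.pending = s
                break
            try:
                value, end = self.decoder.raw_decode(s)
            except json.JSONDecodeError:
                # The hint was wrong; that's okay, wait for more input.
                self.pending = s
                break
            threads.append(value)
            self.pending = s[end:]
        return threads
```

Correctness never depends on the newlines here: a misleading or missing
newline only costs a wasted or delayed parse attempt.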

Tomi

PS: A week ago I also experimented with how notmuch cli could spit sexp
output using the current json formatters -- just naively copying and
modifying those would result in about 99% copy-paste and 1% changes. I got
some initial thoughts on how things could be done, but luckily Peter & Austin
are ahead of me -- I now just eagerly wait for patches to be reviewed :D

