Exporting a single email as JSON

Sat Dec 10 20:29:50 PST 2011

Just to add to Jameson's email...

Quoth Ciprian Dorin Craciun on Dec 11 at 12:46 am:
> On Sat, Dec 10, 2011 at 22:15, Jameson Graef Rollins
> <jrollins at finestructure.net> wrote:
> > On Sat, 10 Dec 2011 20:32:22 +0200, Ciprian Dorin Craciun <ciprian.craciun at gmail.com> wrote:
> >>     Quick question: why isn't it reasonable to export a **single**
> >> email in JSON format (by using the `show` sub-command)? (I mean I
> >> understand that in order to be able to correctly parse the output we
> >> need only one "object" (i.e. a list of threads, containing a list of
> >> emails, etc.). But there might be use cases in which we need a
> >> "twist".)
> >
> > Hi, Ciprian.  I agree that it would be nice to have the ability to
> > output single messages without the rest of their thread.  I have on
> > occasion wanted this functionality, but never enough to get around to
> > implementing it.  It definitely wouldn't be that hard to implement,
> > though.
> >
> > The notmuch show function is actually going through a pretty major
> > overhaul at the moment.  I bet as soon as that's done we can get some
> > sort of single-message output going.
> >
> > jamie.
> 
> 
>     I've given a quick look into `notmuch-show.c` (commit from
> December 4) and indeed it seems quite trivial to add new formats.

I think it might make sense for formats to accept options that
fine-tune their output in orthogonal ways, rather than guessing what
consumers need.

However, I don't think adding *new* formats is the way to go.  We need
to be careful to limit the formats in order to prevent divergence.
There's a lot of information notmuch show could include in its output
and the few existing formats already include very different subsets of
this information.  We don't want to get into a situation where, say,
the array-JSON format evolves to include one thing while the
line-broken-JSON format evolves to include another and consumers have
to choose based on the information they need and not on what's easiest
for them to consume.

>     I think it's quite hard to get this feature "right". I.e. I can
> see the following different -- but equally likely -- use-cases:
>     * in my use-case I would need each line of the output to be a
> standalone JSON object of an individual message; (thus I can script
> with Bash `notmuch ... | while read message ; do ... ; done`;)

As Jameson mentioned, similar things have been discussed in the
context of notmuch search.  And the motivation there is related: we
want it to be easy to consume one result at a time, which means it
needs to be easy to know when the input is complete enough to pass to
a JSON parser.  In the case of show, this doesn't have to be at odds
with the existing format; we can leave the giant array for consumers
that don't need the complexity of streaming, but ensure that newlines
only appear between top-level array elements and nowhere else,
providing an in-band framing for streaming consumers.  I'm not sure
how you would do this with show, given its nested structure.
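
To make the framing idea a bit more concrete, here is a rough Python
sketch of what a streaming consumer might look like.  It is only an
illustration under an assumed output layout, not something notmuch
produces today: the whole output is still one valid JSON array, and a
newline appears only after each top-level element (plus its trailing
"," or the closing "]").  The name `iter_top_level` and the sample
data are made up for the example.

```python
import json


def iter_top_level(lines):
    """Yield top-level elements of a JSON array, one input line at a time.

    Assumed (hypothetical) framing: the output is a single JSON array and
    newlines occur only between its top-level elements, so every line
    holds exactly one complete element plus framing punctuation.
    """
    decoder = json.JSONDecoder()
    seen_open = False
    for line in lines:
        text = line.strip()
        if not text:
            continue
        if not seen_open:
            # Drop only the outermost "[": an element may itself be an array.
            if not text.startswith("["):
                raise ValueError("expected '[' at start of output")
            text = text[1:].lstrip()
            seen_open = True
        if not text or text.startswith("]"):
            # Nothing left on this line except the end of the outer array.
            continue
        # raw_decode parses one JSON value and reports where it stopped.
        value, end = decoder.raw_decode(text)
        rest = text[end:].strip()
        if rest not in ("", ",", "]"):
            raise ValueError("unexpected text after element: %r" % rest)
        yield value


# Made-up sample in the assumed layout: three top-level elements, each
# of which is itself a (possibly empty) array.
SAMPLE = '[[{"id": "a"}],\n[{"id": "b"}, {"id": "c"}],\n[]]\n'

# A consumer can handle each element as soon as its line arrives, e.g.
#     for element in iter_top_level(sys.stdin): ...
elements = list(iter_top_level(SAMPLE.splitlines()))
```

Since the output remains an ordinary JSON array, a consumer that
doesn't care about streaming can still pass the whole thing to a JSON
parser and get the same elements.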