[PATCH] rewriting notmuch-search for structured output to make other output formats easier
Austin Clements
amdragon at MIT.EDU
Sat Jan 21 16:23:01 PST 2012
Quoth Peter Feigl on Jan 22 at 12:17 am:
> On Sat, 21 Jan 2012 17:04:07 -0500, Austin Clements <amdragon at MIT.EDU> wrote:
> > Quoth Peter Feigl on Jan 21 at 10:16 pm:
> > I think this is a great idea and I'm a fan of having an S-expression
> > format, but I also think there's a much easier and more general way to
> > structure this.
> >
> > In particular, I don't think you should hijack search_format, since
> > you'll wind up repeating most of this work for anything else that
> > needs to output structured data (notmuch show, at least). Rather, I'd
> > suggest creating a general "structure printer" struct that isn't tied
> > to search. You've essentially already done this, you just called it
> > search_format. Then, leave the existing format callbacks in place,
> > just use this new API instead of lots of printf calls.
>
> I'm sorry I haven't been more clear about this, the intention was all
> along to check whether this would be ok in notmuch-search, and if it got
> accepted there, to factor it out into a separate file and then use it in
> notmuch-show and notmuch-reply. There's nothing in the printer (except
> for the name of the struct) that ties it to search.
Great! However, it seems counterproductive to put all of these
structures and functions into search only to then move them somewhere
else. Wouldn't it make more sense to introduce the generic structure
printer in a first patch, then refactor the existing code to use it in
a second (or maybe more) patch in a series?
> > What about all of those annoying {tag,item}_{start,sep,end} fields? I
> > think you can simultaneously get rid of those *and* simplify the
> > structure printer API. If the structure printer is allowed to keep a
> > little state, it can automatically insert separators. With a little
> > nesting state, it could even insert terminators by just saying "pop me
> > to this level". This could probably be better, but I'm imagining
> > something like
>
> I agree, however this is complicated by the fact that there are
> additional restrictions on the actually printed code: newlines should be
> placed at strategic locations to improve parsability, which could be
> hard to decide in the printer alone without support from the logic that
> drives it.
True, but that support is easy to add and, I would argue, the exact
implementation of framing *should* be up to the printer (though
obviously where to place the framing should be up to the caller).
Following the general structure of the API I was proposing, you could
add a function like
/* Print a framing break that is easy to detect in a parser. */
void
sprinter_break (struct sprinter *sp);
that for JSON and S-expressions could just print a newline. Or, for
JSON, if we want separators to appear before the newline (which they
don't have to; we can choose our framing however we want), it could
simply record that a newline should be printed after the next
separator or terminator.
> > struct sprinter *
> > new_json_sprinter (const void *ctx, FILE *stream);
> > struct sprinter *
> > new_sexp_sprinter (const void *ctx, FILE *stream);
> >
> > /* Start a map (a JSON object or a S-expression alist/plist/whatever)
> > * and return the nesting level of the map. */
> > int
> > sprinter_map (struct sprinter *sp);
> > /* Start a list (aka array) and return the nesting level of the list. */
> > int
> > sprinter_list (struct sprinter *sp);
> >
> > /* Close maps and lists until reaching level. */
> > void
> > sprinter_pop (struct sprinter *sp, int level);
> >
> > void
> > sprinter_map_key (struct sprinter *sp, const char *key);
> > void
> > sprinter_number (struct sprinter *sp, int val);
> > void
> > sprinter_string (struct sprinter *sp, const char *val);
> > void
> > sprinter_bool (struct sprinter *sp, notmuch_bool_t val);
> >
> > and that's it. This would also subsume your format_attribute_*
> > helpers.
> >
> > Unfortunately, it's a pain to pass things like a structure printer
> > object around formatters (too bad notmuch isn't written in C++, eh?).
> > I think it's better to address this than to structure around it.
> > Probably the simplest thing to do is to make a struct for formatter
> > state and pass that in to the callbacks. You could also more
> > completely emulate classes, but that would probably be overkill for
> > this.
>
> I believe this approach is similar to the one I've implemented (though
> probably higher level, not so many details explicitly written into the
> formatting code). I would suggest trying to get any sort of structured
*nods* It was definitely inspired by yours. The complexity has to go
somewhere, and in the long run it seems it would be better to isolate
it in the printer than to deal with it in every caller. My ulterior
motive was to maintain roughly the existing and extensible callback
structure while eliminating the need for the various start/sep/end
fields, which would have to differ between the formats and would
otherwise duplicate knowledge that should be isolated to a structure
printer.
> formatters (whether more like your suggestions or like the thing I
> implemented doesn't matter so much) into the main codebase, and then
> refactoring the other parts to use it. I've thought about providing a
> single patch to all of notmuch that accomplishes this, but I've deemed
> it too large and complicated to be accepted, I thought limiting it to
> notmuch-search would be a way to get it set up, so that it could be
> expanded to the other parts later.
I've found that smaller patches are much more likely to get reviewed
on the notmuch list, even if they're part of a series that adds up to
a big change.
> Thanks for the comments, I'll keep thinking about your design, a very
> interesting idea!
>
> Peter
More information about the notmuch
mailing list