[PATCH] test: Improve tests for the date/time parser module
Jani Nikula
jani at nikula.org
Wed Oct 3 13:32:16 PDT 2012
On Tue, 25 Sep 2012, Michal Sojka <sojkam1 at fel.cvut.cz> wrote:
> This patch reworks date/time parser library test program to make it
> easier to to write the actual tests. It also modifies the notmuch test
> script and adds several new tests to it.
Cool!
> The INPUT file for the test contains both the dates to be parsed as well
> as the "expected" results. The test program outputs the results in the
> same format and replaces expected results with real results. Currently,
> the "expected" results in the INPUT file correspond to the real results,
> so the test passes. Some results are, however, different from what I
> would expect - this is mentioned in the comments after '#'.
Please see comments inline next to tests.
>
> This patch applies on top of Jani's patchset.
> ---
> It can be seen that there are several errors and unexpected results.
> As I've already written, I'm not sure that the approach taken by this
> library is the right one. I tend to agree with mina86, that using a
> more systematic approach (such as bison) would be beneficial.
Then we just have to agree to disagree here. :)
> This is however not to say to throw this patchset away. Either Jani
> will be able to fix all the corner cases. Or we can work together to
> develop a better solution - add support for ranges to the bison
> parser.
I think I've got the cases pretty much covered, and they're mostly not
about syntax and parsing, but rather semantics; what to do with the
parsed data.
> -Michal
>
> diff --git a/test/Makefile.local b/test/Makefile.local
> index 9ae130a..b9105c7 100644
> --- a/test/Makefile.local
> +++ b/test/Makefile.local
> @@ -20,7 +20,7 @@ $(dir)/symbol-test: $(dir)/symbol-test.o
> $(call quiet,CXX) $^ -o $@ -Llib -lnotmuch $(XAPIAN_LDFLAGS)
>
> $(dir)/parse-time: $(dir)/parse-time.o parse-time-string/parse-time-string.o
> - $(call quiet,CC) $^ -o $@
> + $(call quiet,CC) $^ -o $@ -lrt
>
> .PHONY: test check
>
> diff --git a/test/parse-time-string b/test/parse-time-string
> index 34b80d7..265437c 100755
> --- a/test/parse-time-string
> +++ b/test/parse-time-string
> @@ -14,13 +14,48 @@ _parse_time ()
> ${TEST_DIRECTORY}/parse-time --format=%s "$*"
> }
>
> -test_begin_subtest "date(1) default format without TZ code"
> -test_expect_equal "$(_parse_time Fri Aug 3 23:06:06 2012)" "$(_date Fri Aug 3 23:06:06 2012)"
> +test_begin_subtest "Date parser tests"
> +cat <<EOF > INPUT
> +now -> Tue Jan 11 11:11:00 +0000 2011
> +2010-1-1 -> parse_time_string() error: 5
I think that's invalid per ISO 8601.
> +Jan 2 -> Sat Jan 02 11:11:00 +0000 2010 # Why 2010?
This is an interesting bug. The idea was that specifying a month without
year would always refer to past. So when you give Jan 2011 in the
reference time, Jan refers to the previous year. Seems simple, but
looking closer, also "Jan 2 this year" would end up in 2010. Not good.
It would probably be possible to fix this, but per the simplicity of
implementation and unambiguity goals, I think I'll just make them refer
to current year, at least for now. This may mean having to look into
supporting "last {monthname,weekday}" for better usability, but I'll
leave that as a future improvement.
> +Mon -> Mon Jan 10 11:11:00 +0000 2011
> +last Friday -> parse_time_string() error: 4
"last <weekday>" is not supported.
> +2 hours ago -> parse_time_string() error: 1
"ago" is not supported.
> +last month -> Sat Dec 11 11:11:00 +0000 2010
> +month ago -> parse_time_string() error: 1
Ditto.
> +8am -> Tue Jan 11 08:00:00 +0000 2011
> +9:15 -> Tue Jan 11 09:15:00 +0000 2011
> +12:34 -> Tue Jan 11 12:34:00 +0000 2011
> +monday -> Mon Jan 10 11:11:00 +0000 2011
> +yesterday -> Mon Jan 10 11:11:00 +0000 2011
> +tomorrow -> parse_time_string() error: 1
"tomorrow" is not supported (do you get a lot of mail from the future?
;)
> + -> Tue Jan 11 11:11:00 +0000 2011 # Shouldn't empty string return an error???
*shrug* It just starts with the reference time, and finds nothing to add
or remove. Let's call it a feature.
>
> -test_begin_subtest "date(1) --rfc-2822 format"
> -test_expect_equal "$(_parse_time Fri, 03 Aug 2012 23:07:46 +0100)" "$(_date Fri, 03 Aug 2012 23:07:46 +0100)"
> +Aug 3 23:06:06 2012 -> Fri Aug 03 23:06:06 +0000 2012 # date(1) default format without TZ code
> +Fri, 03 Aug 2012 23:07:46 +0100 -> Fri Aug 03 22:07:46 +0000 2012 # rfc-2822
> +2012-08-03 23:09:37+03:00 -> Fri Aug 03 20:09:37 +0000 2012 # rfc-3339 seconds
>
> -test_begin_subtest "date(1) --rfc=3339=seconds format"
> -test_expect_equal "$(_parse_time 2012-08-03 23:09:37+03:00)" "$(_date 2012-08-03 23:09:37+03:00)"
> +10s -> Tue Jan 11 11:10:50 +0000 2011
> +19701223s -> Wed Dec 23 11:10:59 +0000 1970 # Surprising - number is parsed as date and 's' as '1 second'
Will be fixed.
> +19701223 -> Wed Dec 23 11:11:00 +0000 1970
> +
> +19701223 +0100 -> Wed Dec 23 11:11:00 +0000 1970 # Timezone is ignored without an error
It's not ignored. Date is specified, but the time comes from the
reference time. It's the same absolute time regardless of the timezone.
> +
> +today ^-> Wed Jan 12 00:00:00 +0000 2011 # This should be 11 23:59:59
See my previous mail. Can be fixed.
> +today v-> Tue Jan 11 00:00:00 +0000 2011
> +
> +thisweek ^-> Sun Jan 16 00:00:00 +0000 2011 # This should be Sunday 23:59:59
> +thisweek v-> Sun Jan 09 00:00:00 +0000 2011 # This should be Monday 00:00:00
Implementation simplicity and dodging a localization issue. Start of the
week is based on the tm_mday field of struct tm, where 0 == Sunday. Even
if I personally agree Monday is the 1st day of the week.
> +
> +two months ago-> parse_time_string() error: 1 # Comments in the code suggest that this is supported
When in doubt, trust code over comments. ;)
> +two months -> Thu Nov 11 11:11:00 +0000 2010
> +
> +1348569850 -> parse_time_string() error: 4 # Seconds since epoch not yet supported? Backward compatibility in notmuch???
> +10 -> parse_time_string() error: 4 # Seconds since epoch?
Indeed, seconds since epoch not yet supported. There is no backwards
compatibility issue, as you can still use the
<initial-timestamp>..<final-timestamp> format as described in
notmuch-search-terms(7) man page. The new date:<since>..<until> just
doesn't support seconds since epoch yet. And I think I'll make the
syntax "@<timestamp>" to let you have "<really-big-number> seconds"
without surprises.
> +EOF
> +
> +${TEST_DIRECTORY}/parse-time --now="Tue Jan 11 11:11:00 +0000 2011" < INPUT > OUTPUT
> +test_expect_equal_file INPUT OUTPUT
>
> test_done
> diff --git a/test/parse-time.c b/test/parse-time.c
> index b4de76b..0415f49 100644
> --- a/test/parse-time.c
> +++ b/test/parse-time.c
> @@ -18,59 +18,47 @@
> * Author: Jani Nikula <jani at nikula.org>
> */
>
> +
> +#define _XOPEN_SOURCE 500 /* for strptime() and snprintf() */
> #include <getopt.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
> +#include <time.h>
>
> #include "parse-time-string.h"
>
> -/*
> - * concat argv[start]...argv[end - 1], separating them by a single
> - * space, to a malloced string
> - */
> -static char *
> -concat_args (int start, int end, char *argv[])
> -{
> - int i;
> - size_t len = 1;
> - char *p;
> -
> - for (i = start; i < end; i++)
> - len += strlen (argv[i]) + 1;
> -
> - p = malloc (len);
> - if (!p)
> - return NULL;
> -
> - *p = 0;
> -
> - for (i = start; i < end; i++) {
> - if (i != start)
> - strcat (p, " ");
> - strcat (p, argv[i]);
> - }
> -
> - return p;
> -}
> -
> #define DEFAULT_FORMAT "%a %b %d %T %z %Y"
>
> static void
> usage (const char *name)
> {
> - printf ("Usage: %s [options ...] <date/time>\n\n", name);
> + printf ("Usage: %s [options ...]\n\n", name);
> printf (
> - "Parse <date/time> and display it in given format.\n\n"
> - " -f, --format=FMT output format, FMT according to strftime(3)\n"
> - " (default: \"%s\")\n"
> - " -n, --now=N use N seconds since epoch as now (default: now)\n"
> - " -u, --up round result up (default: no rounding)\n"
> - " -d, --down round result down (default: no rounding)\n"
> - " -h, --help print this help\n",
> + "Parse date/time read from stdin and display it in given format.\n\n"
> + " -f, --format=FMT output format for dates and input format for --now,\n"
> + " FMT according to strftime(3) (default: \"%s\")\n"
> + " -n, --now=N reference date in FMT (default: now)\n"
> + " -h, --help print this help\n"
> + "\n"
> + "stdin should contain one date/time per line in the following format:\n"
> + " <date/time> [ <arrow> [ comment ] ]\n"
> + "where <arrow> determines the operation performed on the <date/time>.\n"
> + "It can be one of '->', '^->', 'v->' meaning convert, convert and round\n"
> + "up, convert and round down, respectively.\n",
> DEFAULT_FORMAT);
> }
>
> +static const char *
> +get_round_str (int round)
> +{
> + switch (round) {
> + case PARSE_TIME_ROUND_UP: return "^";
> + case PARSE_TIME_ROUND_DOWN: return "v";
> + default: return "";
> + }
> +}
> +
> int
> main (int argc, char *argv[])
> {
> @@ -79,14 +67,10 @@ main (int argc, char *argv[])
> time_t result;
> time_t now;
> time_t *nowp = NULL;
> - char *argstr;
> int round = PARSE_TIME_NO_ROUND;
> - char buf[1024];
> const char *format = DEFAULT_FORMAT;
> struct option options[] = {
> { "help", no_argument, NULL, 'h' },
> - { "up", no_argument, NULL, 'u' },
> - { "down", no_argument, NULL, 'd' },
> { "format", required_argument, NULL, 'f' },
> { "now", required_argument, NULL, 'n' },
> { NULL, 0, NULL, 0 },
> @@ -111,8 +95,13 @@ main (int argc, char *argv[])
> round = PARSE_TIME_ROUND_DOWN;
> break;
> case 'n':
> - /* specify now in seconds since epoch */
> - now = (time_t) strtol (optarg, NULL, 10);
> + memset (&tm, 0, sizeof (tm));
> + char *parsed = strptime (optarg, format, &tm);
One of the problems with strptime is that it doesn't support time zones,
which is why I chose not to use it here. (You can specify %z in the
format to ignore it, but it looks like it's ignored no matter
what. *shrug*) Combined with mktime below, you introduce possible TZ and
DST variations in the tests, which can be problematic. So perhaps we
should keep the reference time as a timestamp here.
I didn't look at this test tool patch very closely yet, but in general I
like the very much increased clarity in the tests. I'll look into this
more when I have a moment.
Thanks for the tests, comments, and corner cases. They've been helpful.
BR,
Jani.
> + if (!parsed) {
> + fprintf (stderr, "Cannot parse reference date: %s\n", optarg);
> + return 1;
> + }
> + now = mktime (&tm);
> if (now >= (time_t) 0)
> nowp = &now;
> break;
> @@ -124,22 +113,47 @@ main (int argc, char *argv[])
> }
> }
>
> - argstr = concat_args (optind, argc, argv);
> - if (!argstr)
> - return 1;
> -
> - r = parse_time_string (argstr, &result, nowp, round);
> -
> - free (argstr);
> -
> - if (r)
> - return 1;
> -
> - if (!localtime_r (&result, &tm))
> - return 1;
> -
> - strftime (buf, sizeof (buf), format, &tm);
> - printf ("%s\n", buf);
> + char input[BUFSIZ];
> + while (fgets (input, BUFSIZ, stdin) && input[0]) {
> + if (input[0] == '\n') {
> + printf ("\n");
> + continue;
> + }
> + char *arrow;
> + char *comment = strrchr (input, '#');
> + arrow = strstr (input, "->");
> + round = PARSE_TIME_NO_ROUND;
> + if (arrow > input) {
> + switch (arrow[-1]) {
> + case '^': round = PARSE_TIME_ROUND_UP; arrow--; break;
> + case 'v': round = PARSE_TIME_ROUND_DOWN; arrow--; break;
> + default: break;
> + }
> + }
> + if (arrow)
> + *arrow = 0;
> + else
> + arrow = input + strlen (input); /* XXX: comment is not handled */
> + while (arrow > input && arrow[-1] == '\n')
> + arrow--;
> + *arrow-- = 0;
> +
> + r = parse_time_string (input, &result, nowp, round);
> + char resstr[BUFSIZ];
> + if (r)
> + snprintf (resstr, sizeof(resstr), "parse_time_string() error: %d", r);
> + else if (!localtime_r (&result, &tm))
> + snprintf (resstr, sizeof(resstr), "localtime(result) error");
> + else
> + strftime (resstr, sizeof (resstr), format, &tm);
> +
> + char buf[BUFSIZ];
> + snprintf (buf, sizeof(buf), "%s%s-> %s", input, get_round_str (round), resstr);
> + if (!comment)
> + printf ("%s\n", buf);
> + else
> + printf ("%-*s%s", (int)(comment - input), buf, comment);
> + }
>
> return 0;
> }
More information about the notmuch
mailing list