[Patch v8 07/18] unhex_and_quote: new function to quote hex-decoded queries
Jani Nikula
jani at nikula.org
Sat Dec 22 15:36:03 PST 2012
On Fri, 21 Dec 2012, david at tethera.net wrote:
> From: David Bremner <bremner at debian.org>
>
> Space delimited tokens are hex decoded and then quoted according to
> Xapian rules. Prefixes and '*' are passed through unquoted, as is
> anything that hex-decoding would not change.
> ---
> tag-util.c | 81 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 81 insertions(+)
>
> diff --git a/tag-util.c b/tag-util.c
> index f89669a..46aab4e 100644
> --- a/tag-util.c
> +++ b/tag-util.c
> @@ -56,6 +56,87 @@ illegal_tag (const char *tag, notmuch_bool_t remove)
> return NULL;
> }
>
> +/* Input is a hex encoded string, presumed to be a query for Xapian.
> + *
> + * Space delimited tokens are decoded and quoted, with '*' and prefixes
> + * of the form "foo:" passed through unquoted.
> + */
> +static tag_parse_status_t
> +unhex_and_quote (void *ctx, char *encoded, const char *line_for_error,
> + char **query_string)
> +{
> + char *tok = encoded;
> + size_t tok_len = 0;
> + char *buf = NULL;
> + size_t buf_len = 0;
> + tag_parse_status_t ret = TAG_PARSE_SUCCESS;
> +
> + *query_string = talloc_strdup (ctx, "");
> +
> + while ((tok = strtok_len (tok + tok_len, " ", &tok_len)) != NULL) {
> +
> + size_t prefix_len;
> + char delim = *(tok + tok_len);
> +
> + *(tok + tok_len++) = '\0';
You need to do tok_len++ to satisfy the next round of strtok_len, but I
think for clarity of the code below you should move tok_len++ to the end
of the while block. And review tok_len usage below.
> +
> + prefix_len = hex_invariant (tok, tok_len);
This, along with the doc comment "...initial segment of str that would
not be changed by hex encoding..." for hex_invariant, was the hardest
bit to understand. I don't follow how hex *encoding* matters here; the
input is only affected by hex *decoding*, and that depends on %NN only.
Should this be a function that counts the length of the initial segment
of str that consists of valid Xapian prefix characters and does not
contain % (I don't know if that's included in Xapian prefix
characters). In the end, the contents of the function may be (I don't
know) exactly the same as hex_invariant, but the function name would be
more self explanatory.
Does that make any sense to you...?
> +
> + if ((strcmp (tok, "*") == 0) || prefix_len >= tok_len - 1) {
With the tok_len++ at the end, I think this should have "prefix_len ==
tok_len" for clarity.
> +
> + /* pass some things through without quoting or decoding.
> + * Note for '*' this is mandatory.
> + */
> +
> + if (! (*query_string = talloc_asprintf_append_buffer (
> + *query_string, "%s%c", tok, delim))) {
> +
> + ret = line_error (TAG_PARSE_OUT_OF_MEMORY,
> + line_for_error, "aborting");
> + goto DONE;
> + }
> +
> + } else {
> + /* potential prefix: one for ':', then something after */
> + if ((tok_len - prefix_len > 2) && *(tok + prefix_len) == ':') {
I don't think this takes into account the tok_len++. So this should stay
as it is if you move tok_len++ to the end.
> + if (! (*query_string = talloc_strndup_append (*query_string,
> + tok,
> + prefix_len + 1))) {
> + ret = line_error (TAG_PARSE_OUT_OF_MEMORY,
> + line_for_error, "aborting");
> + goto DONE;
> + }
> + tok += prefix_len + 1;
> + tok_len -= prefix_len + 1;
> + }
> +
> + if (hex_decode_inplace (tok) != HEX_SUCCESS) {
> + ret = line_error (TAG_PARSE_INVALID, line_for_error,
> + "hex decoding of token '%s' failed", tok);
> + goto DONE;
> + }
> +
> + if (double_quote_str (ctx, tok, &buf, &buf_len)) {
> + ret = line_error (TAG_PARSE_OUT_OF_MEMORY,
> + line_for_error, "aborting");
> + goto DONE;
> + }
> +
> + if (! (*query_string = talloc_asprintf_append_buffer (
> + *query_string, "%s%c", buf, delim))) {
> + ret = line_error (TAG_PARSE_OUT_OF_MEMORY,
> + line_for_error, "aborting");
> + goto DONE;
> + }
> + }
I think tok_len++ should be here.
BR,
Jani.
> + }
> +
> + DONE:
> + if (ret != TAG_PARSE_SUCCESS && *query_string)
> + talloc_free (*query_string);
> + return ret;
> +}
> +
> tag_parse_status_t
> parse_tag_line (void *ctx, char *line,
> tag_op_flag_t flags,
> --
> 1.7.10.4
>
> _______________________________________________
> notmuch mailing list
> notmuch at notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch
More information about the notmuch
mailing list