[Patch v5 3/6] util: add gz_readline

Tomi Ollila tomi.ollila at iki.fi
Wed Apr 2 09:43:07 PDT 2014


On Wed, Apr 02 2014, Austin Clements <amdragon at MIT.EDU> wrote:

> Quoth David Bremner on Apr 01 at 10:16 pm:
>> The idea is to provide a more or less drop in replacement for readline
>> to read from zlib/gzip streams.  Take the opportunity to replace
>> malloc with talloc.
>> ---
>>  util/Makefile.local |  2 +-
>>  util/util.h         | 12 +++++++++
>>  util/zlib-extra.c   | 76 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>>  util/zlib-extra.h   | 11 ++++++++
>>  4 files changed, 100 insertions(+), 1 deletion(-)
>>  create mode 100644 util/util.h
>>  create mode 100644 util/zlib-extra.c
>>  create mode 100644 util/zlib-extra.h
>> 
>> diff --git a/util/Makefile.local b/util/Makefile.local
>> index 29c0ce6..e2a5b65 100644
>> --- a/util/Makefile.local
>> +++ b/util/Makefile.local
>> @@ -4,7 +4,7 @@ dir := util
>>  extra_cflags += -I$(srcdir)/$(dir)
>>  
>>  libutil_c_srcs := $(dir)/xutil.c $(dir)/error_util.c $(dir)/hex-escape.c \
>> -		  $(dir)/string-util.c $(dir)/talloc-extra.c
>> +		  $(dir)/string-util.c $(dir)/talloc-extra.c $(dir)/zlib-extra.c
>>  
>>  libutil_modules := $(libutil_c_srcs:.c=.o)
>>  
>> diff --git a/util/util.h b/util/util.h
>> new file mode 100644
>> index 0000000..8663cfc
>> --- /dev/null
>> +++ b/util/util.h
>> @@ -0,0 +1,12 @@
>> +#ifndef _UTIL_H
>> +#define _UTIL_H
>> +
>> +typedef enum util_status {
>> +    UTIL_SUCCESS = 0,
>> +    UTIL_ERROR = 1,
>> +    UTIL_OUT_OF_MEMORY,
>> +    UTIL_EOF,
>> +    UTIL_FILE,
>> +} util_status_t;
>> +
>> +#endif
>> diff --git a/util/zlib-extra.c b/util/zlib-extra.c
>> new file mode 100644
>> index 0000000..cb1eba0
>> --- /dev/null
>> +++ b/util/zlib-extra.c
>> @@ -0,0 +1,76 @@
>> +/* zlib-extra.c -  Extra or enhanced routines for compressed I/O.
>> + *
>> + * Copyright (c) 2014 David Bremner
>> + *
>> + * This program is free software: you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation, either version 3 of the License, or
>> + * (at your option) any later version.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program.  If not, see http://www.gnu.org/licenses/ .
>> + *
>> + * Author: David Bremner <david at tethera.net>
>> + */
>> +
>> +#include "zlib-extra.h"
>> +#include <talloc.h>
>> +#include <stdio.h>
>> +#include <string.h>
>> +
>> +/* mimic POSIX/glibc getline, but on a zlib gzFile stream, and using talloc */
>> +util_status_t
>> +gz_getline (void *talloc_ctx, char **bufptr, size_t *bufsiz, ssize_t *bytes_read,
>
> Talloc chunks know their size, so rather than taking bufsize, use
> talloc_get_size (or talloc_array_length if you switch to talloc array
> functions below).

Now yoy David have a chance to drop the bufsiz argument altogether, as the
info is available in *bufptr:s talloc context...

>
>> +	    gzFile stream)
>> +{
>> +    size_t len = *bufsiz;
>> +    char *buf = *bufptr;
>> +    size_t offset = 0;
>> +
>> +    if (len == 0 || buf == NULL) {
>> +	/* same as getdelim from gnulib */
>> +	len = 120;
>
> This is presumably because glibc's malloc has an 8 byte header.  Fun
> fact: talloc has a 104 byte header (on 64-bit and including the malloc
> header).

hmm, what should we choose here? 152 ? Some bikeshedding on IRC ?

>> +	buf = talloc_size (talloc_ctx, len);
>> +	if (buf == NULL)
>> +	    return UTIL_OUT_OF_MEMORY;
>> +    }
>> +
>> +    while (1) {
>> +	if (! gzgets (stream, buf + offset, len - offset)) {
>> +	    int zlib_status = 0;
>> +	    (void) gzerror (stream, &zlib_status);
>> +	    switch (zlib_status) {
>> +	    case Z_OK:
>> +		/* follow getline behaviour */
>> +		*bytes_read = -1;
>
> Is this really what getline does when the last line of a file isn't
> \n-terminated?

Maybe the previous call returned non-\n -terminated string and
for this call there was 0 bytes left to return ???

Tomi

>> +		return UTIL_EOF;
>> +		break;
>> +	    case Z_ERRNO:
>> +		return UTIL_FILE;
>> +		break;
>> +	    default:
>> +		return UTIL_ERROR;
>> +	    }
>> +	}
>> +
>> +	offset += strlen (buf + offset);
>> +
>> +	if ( buf[offset - 1] == '\n' )
>
> Too many spaces!
>
>> +	    break;
>> +
>> +	len *= 2;
>> +	buf = talloc_realloc (talloc_ctx, buf, char, len);
>
> Or talloc_realloc_size, to match the initial talloc_size.
> Alternatively, the initial talloc_size could be a talloc_array.
>
>> +	if (buf == NULL)
>> +	    return UTIL_OUT_OF_MEMORY;
>> +    }
>> +
>> +    *bufptr = buf;
>> +    *bufsiz = len;
>> +    *bytes_read = offset;
>> +    return UTIL_SUCCESS;
>> +}
>> diff --git a/util/zlib-extra.h b/util/zlib-extra.h
>> new file mode 100644
>> index 0000000..ed46ac1
>> --- /dev/null
>> +++ b/util/zlib-extra.h
>> @@ -0,0 +1,11 @@
>> +#ifndef _ZLIB_EXTRA_H
>> +#define _ZLIB_EXTRA_H
>> +
>> +#include <zlib.h>
>> +#include "util.h"
>
> I'd put "util.h" first so we're more likely to catch missing header
> dependencies (obviously util.h doesn't have any right now, but in the
> future).
>
> Also, I'd put a blank line after the #includes.
>
>> +/* Like getline, but read from a gzFile. Allocation is with talloc */
>> +util_status_t
>> +gz_getline (void *ctx, char **lineptr, size_t *line_size, ssize_t *bytes_read,
>> +	    gzFile stream);
>> +
>> +#endif


More information about the notmuch mailing list