[PATCH] test: add known broken test for indexing html
Jeffrey Stedfast
jestedfa at microsoft.com
Sun Mar 19 10:24:58 PDT 2017
> -----Original Message-----
> From: David Bremner [mailto:david at tethera.net]
> Sent: Saturday, March 18, 2017 2:15 PM
> To: Jeffrey Stedfast <jestedfa at microsoft.com>; notmuch at notmuchmail.org
> Subject: RE: [PATCH] test: add known broken test for indexing html
>
> Jeffrey Stedfast <jestedfa at microsoft.com> writes:
>
> > Hey David,
> >
> > I actually have an HTML tokenizer for MimeKit for (among other things)
> > this type of purpose. Perhaps I need to port that to C and include
> > that with GMime 😊
> >
> > https://github.com/jstedfast/MimeKit/tree/master/MimeKit/Text
> >
> > Jeff
>
> That's probably a good idea in your abundant spare time ;). More generally
> though we've thought about letting users provide filters to convert
> attachments (e.g. .odt / .docx / pdf) to text. I'm not sure about the
> performance hit, but I guess that would work for html as well.
> I guess in principle it should be possible to write a GMime filter that manages
> the child process.
>
> d
Hah, yeah... it'll probably be a while. I need to focus on GMime 3.0 first. Once I get that squared away, I can look at porting other handy features back from MimeKit 😊
Jeff
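
For illustration only, the user-supplied filter idea David describes above (an external program that converts an attachment to text, run as a child process) might look roughly like the sketch below. This is not notmuch or GMime code. The names attachment_to_text, index_text_for and USER_FILTERS, and the "html-to-text" command, are all hypothetical, and the sketch assumes the converter reads the attachment on stdin and writes plain text to stdout.

```python
import subprocess
import sys


def attachment_to_text(data, command, timeout=30.0):
    """Pipe raw attachment bytes through an external converter.

    Assumption: `command` reads the attachment on stdin and writes
    plain text to stdout. Raises on a non-zero exit status or timeout.
    """
    result = subprocess.run(
        command,
        input=data,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        timeout=timeout,
        check=True,
    )
    return result.stdout.decode("utf-8", errors="replace")


# Hypothetical user configuration: MIME type -> converter command.
# "html-to-text" is a placeholder, not a real program.
USER_FILTERS = {
    "text/html": ["html-to-text"],
}


def index_text_for(mime_type, data):
    """Return text to index, or None if no filter applies or it fails."""
    command = USER_FILTERS.get(mime_type)
    if command is None:
        return None
    try:
        return attachment_to_text(data, command)
    except (OSError, subprocess.SubprocessError):
        # A missing, crashing, or hung converter should not abort indexing.
        return None


if __name__ == "__main__":
    # Stand-in converter for the demo: uppercases whatever it reads.
    demo = [sys.executable, "-c",
            "import sys; sys.stdout.write(sys.stdin.read().upper())"]
    print(attachment_to_text(b"some attachment body", demo))
```

Spawning one process per attachment is where the performance concern mentioned above would show up. The timeout and the fall-back to None are one way to keep a bad converter from stalling or aborting an indexing run.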