[PATCH] test: add known broken test for indexing html

Jeffrey Stedfast jestedfa at microsoft.com
Sun Mar 19 10:24:58 PDT 2017


> -----Original Message-----
> From: David Bremner [mailto:david at tethera.net]
> Sent: Saturday, March 18, 2017 2:15 PM
> To: Jeffrey Stedfast <jestedfa at microsoft.com>; notmuch at notmuchmail.org
> Subject: RE: [PATCH] test: add known broken test for indexing html
> 
> Jeffrey Stedfast <jestedfa at microsoft.com> writes:
> 
> > Hey David,
> >
> > I actually have an HTML tokenizer for MimeKit for (among other things)
> > this type of purpose. Perhaps I need to port that to C and include
> > that with GMime 😊
> >
> > https://github.com/jstedfast/MimeKit/tree/master/MimeKit/Text
> >
> > Jeff
> 
> That's probably a good idea in your abundant spare time ;).  More generally
> though we've thought about letting users provide filters to convert
> attachments (e.g. .odt / .docx / .pdf) to text. I'm not sure about the
> performance hit, but I guess that would work for html as well.
> I guess in principle it should be possible to write a GMime filter that manages
> the child process.
> 
> d


Hah, yeah... it'll probably be a while. I need to focus on GMime 3.0 first. Once I get that squared away, I can look at porting other handy features back from MimeKit 😊

Jeff


