[David Bremner] Re: RFC: drop html tags

David Bremner david at tethera.net
Wed Mar 22 04:12:49 PDT 2017


David Bremner <david at tethera.net> writes:

> From: David Bremner <david at tethera.net>
> Subject: Re: RFC: drop html tags
> To: Steven Allen <steven at stebalien.com>
> Date: Tue, 21 Mar 2017 14:03:10 -0300
>
> Steven Allen <steven at stebalien.com> writes:
>
>> In the JavaScript regex format, I believe the correct way to parse this is:
>>
>>     /<("[^"]*"|'[^']*'|[^"'>]*)*>/g
>>
>> Basically, while inside a tag, ignore everything between double and single quotes.
>
> Thanks for the reality check. It should be possible to handle quotes. In
> my limited understanding of that regex, we can do a bit better by
> forcing pairs of quotes to match, since <chaos attribute="'"> is
> probably legal.

Actually, I'm wrong. My eyes just glaze over when faced with any
non-trivial regex, I guess.
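
The regex quoted above can be checked against the chaos example with a
short script. This is a minimal sketch, assuming Node.js or any modern
JavaScript engine; stripTags is a hypothetical helper name used only for
illustration and is not part of notmuch.

```javascript
// Regex copied verbatim from the quoted message: inside a tag, skip over
// double-quoted and single-quoted runs, so a '>' or a lone quote of the
// other kind inside an attribute value does not end the tag early.
const TAG_RE = /<("[^"]*"|'[^']*'|[^"'>]*)*>/g;

// Hypothetical helper (an assumption, not notmuch code): delete every
// match of TAG_RE from the input string.
function stripTags(text) {
  return text.replace(TAG_RE, '');
}

// The example from the thread: a lone single quote inside double quotes.
console.log(JSON.stringify(stripTags('<chaos attribute="\'">text')));

// A '>' inside a quoted attribute value is treated as part of the tag.
console.log(JSON.stringify(stripTags('a <b title="x > y">bold</b> c')));
```

Since a single quote between double quotes is consumed by the "[^"]*"
alternative, the whole chaos tag is matched as one unit, so the quotes
already pair up without further changes. One caveat, offered as an
observation rather than a measured result: the nested * in the last
alternative can backtrack heavily on input with an unclosed <, and
writing that alternative as [^"'>] (without the inner *) should match
the same tags.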

d

