Re: Having . span newlines in a regexp
At 2004-02-05 18:49 -0500, Mike Schiraldi wrote:
> There's some discussion going on over at mutt-users regarding the behavior
> of . in mutt regular expressions. Specifically, my pattern of:
>
> color index red black '~hX-Spam-Status:.*BAYES_99'
>
> fails to match
>
> X-Spam-Status: Yes, hits=18.8 required=5.0 tests=ADDR_NUMS_AT_BIGSITE,
> BAYES_99,CLICK_BELOW_CAPS,FORGED_YAHOO_RCVD,FROM_ENDS_IN_NUMS,
> HTML_70_80,HTML_FONTCOLOR_RED,HTML_FONT_BIG,HTML_LINK_CLICK_CAPS,
> HTML_LINK_CLICK_HERE,HTML_MESSAGE,MIME_HTML_NO_CHARSET,MIME_HTML_ONLY,
> SUBJ_ALL_CAPS autolearn=no version=2.61
>
> because the . metachar doesn't span newlines.
>
> I'd be happy to implement a solution to this, but in order to increase the
> likelihood of the patch getting accepted, i'd like to ask the community for
> comments on the following potential options:
>
> - Change . so it spans newlines. Cons: Potentially not backwards compatible
> with existing patterns. Pros: It's hard to imagine a situation where someone
> wouldn't want . to span newlines, so maybe that's okay, in the interest of
> keeping things simple.
>
If '.' matched newlines then, for example, the expression
'~hFrom:.*fred' would match

    From: mavis <mavis@xxxxxxxxxxx>
    To: fred <fred@xxxxxxxxxxx>

which isn't what's wanted. Wouldn't it be better to leave '.' alone
and instead unfold each multi-line header into a single line before
attempting a pattern match? That is the way procmail behaves.
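
To make the difference concrete, here is a rough sketch of the unfolding idea in Python (not mutt's C code, and not a proposed patch). The function name unfold_headers, the shortened X-Spam-Status value, and the example.org addresses are all made up for illustration. It assumes continuation lines begin with a space or tab, as folded headers normally do.

```python
import re

def unfold_headers(raw: str) -> str:
    """Hypothetical helper: drop each line break that is followed by a
    space or tab, so a folded header becomes one logical line."""
    return re.sub(r"\r?\n(?=[ \t])", "", raw)

# Illustrative header block; the values are shortened and the addresses invented.
headers = (
    "X-Spam-Status: Yes, hits=18.8 required=5.0 tests=ADDR_NUMS_AT_BIGSITE,\n"
    "\tBAYES_99,CLICK_BELOW_CAPS autolearn=no version=2.61\n"
    "From: mavis <mavis@example.org>\n"
    "To: fred <fred@example.org>\n"
)

unfolded = unfold_headers(headers)

# '.' still stops at newlines, but the folded header is now a single line.
spam_before = re.search(r"^X-Spam-Status:.*BAYES_99", headers, re.M) is not None
spam_after = re.search(r"^X-Spam-Status:.*BAYES_99", unfolded, re.M) is not None

# With unfolding, 'From:.*fred' cannot run on into the To: line.
from_fred = re.search(r"^From:.*fred", unfolded, re.M) is not None

# Letting '.' span newlines instead produces the unwanted match.
from_fred_dotall = re.search(r"^From:.*fred", headers, re.M | re.S) is not None

print(spam_before, spam_after, from_fred, from_fred_dotall)
```

With unfolding, the spam pattern matches and 'From:.*fred' does not leak into the To: header, whereas the newline-spanning '.' gives the false match described above.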