
Re: Having . span newlines in a regexp



At 2004-02-05 18:49 -0500, Mike Schiraldi wrote:
> There's some discussion going on over at mutt-users regarding the behavior
> of . in mutt regular expressions. Specifically, my pattern of:
> 
> color index red black '~hX-Spam-Status:.*BAYES_99'
> 
> fails to match
> 
> X-Spam-Status: Yes, hits=18.8 required=5.0 tests=ADDR_NUMS_AT_BIGSITE,
>         BAYES_99,CLICK_BELOW_CAPS,FORGED_YAHOO_RCVD,FROM_ENDS_IN_NUMS,
>         HTML_70_80,HTML_FONTCOLOR_RED,HTML_FONT_BIG,HTML_LINK_CLICK_CAPS,
>         HTML_LINK_CLICK_HERE,HTML_MESSAGE,MIME_HTML_NO_CHARSET,MIME_HTML_ONLY,
>         SUBJ_ALL_CAPS autolearn=no version=2.61
> 
> because the . metachar doesn't span newlines.
> 
> I'd be happy to implement a solution to this, but in order to increase the
> likelihood of the patch getting accepted, i'd like to ask the community for
> comments on the following potential options:
> 
> - Change . so it spans newlines. Cons: Potentially not backwards compatible
> with existing patterns. Pros: It's hard to imagine a situation where someone
> wouldn't want . to span newlines, so maybe that's okay, in the interest of
> keeping things simple.
> 

If '.' matched newlines then, for example, the expression
'~hFrom:.*fred' would match

  From: mavis <mavis@xxxxxxxxxxx>
  To: fred <fred@xxxxxxxxxxx>

which isn't what's wanted.  Wouldn't it be better to leave '.' alone,
and instead combine multi-line headers into single lines before
attempting a pattern match?  Mutt would then behave in the same way
that procmail does.
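
To make the suggestion concrete, here is a rough sketch in Python (not
mutt's code, and not a patch) of unfolding headers before matching.  The
function names unfold_headers and header_matches are made up for this
illustration, and the unfolding rule assumed is the usual one for mail
headers: a line break followed by a space or tab continues the previous
header line.

```python
import re


def unfold_headers(raw):
    """Join folded header lines into single logical lines.

    Assumption: a newline followed by a space or tab marks a
    continuation, so only the line break itself is removed and the
    leading whitespace is kept.
    """
    return re.sub(r"\r?\n(?=[ \t])", "", raw)


def header_matches(raw, pattern):
    """Return True if pattern matches within any single unfolded header.

    '.' keeps its normal meaning (it does not match a newline), and each
    header is tested on its own, so a match cannot run from one header
    into the next.
    """
    unfolded = unfold_headers(raw)
    return any(re.search(pattern, line) for line in unfolded.splitlines())


if __name__ == "__main__":
    spam = (
        "X-Spam-Status: Yes, hits=18.8 required=5.0 tests=ADDR_NUMS_AT_BIGSITE,\n"
        "        BAYES_99,CLICK_BELOW_CAPS\n"
    )
    two_headers = (
        "From: mavis <mavis@example.com>\n"
        "To: fred <fred@example.com>\n"
    )
    print(header_matches(spam, r"X-Spam-Status:.*BAYES_99"))
    print(header_matches(two_headers, r"From:.*fred"))
```

With this approach the folded X-Spam-Status header is matched as one
line, while 'From:.*fred' still cannot reach into the To: header.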