
Re: Replying to a specific message



On Wed, Aug 27, 2008 at 6:59 PM, Kyle Wheeler <kyle-mutt@xxxxxxxxxxxxxx> wrote:
> On Wednesday, August 27 at 06:10 PM, quoth Shreevatsa R:
>> The only "parsing" that the pattern parser needs to do is break up
>> the user-input string into "logical parts" of the form (~i EXPR),
>> (~s EXPR), etc., and then each corresponding part can do its job
>> with the actual string that was input. (The parser might also have
>> to do some logical AND/OR/NOT operations, but that's later.) It has
>> no reason to poke at the actual EXPR strings and mess with them. The
>> parser already has a () logical grouping operator; I maintain that
>> this is all it needs. By simply refusing to interpret
>> ~i foo\ and\ bar
>> as one argument to ~i and instead requiring either ~i "foo and bar" or
>> ~i (foo and bar), it can completely do away with the extra dequoting
>> step and pass to the regex engine the same strings it gets, intact. It
>> spends an extra step parsing the string and dequoting all characters,
>> when all it needs to do is something much simpler.
>
> So under your system, if I say:
>
>     ~i "foo and bar"
>
> Should the regex engine see:
>
>     foo and bar
>
> ...or should the regex engine see:
>
>     "foo and bar"
>
> In other words, should the quotes (or parentheses) be stripped out?

As quotes would be useful for grouping, a reasonable choice IMHO would
be to strip them out before passing the string to the regex engine.

> If not, how can I match something that doesn't have quotes in it (such
> as the Message-ID)?
>
> If they should, how can I match something that DOES have quotes in it?

Use \". If you want the regex engine to get
"foo and bar"
then you can get it with ~i "\"foo and bar\"". This would not be much
different from the current implementation, which *does* strip out the
quotes and requires one to use \" anyway.
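
To make the proposal concrete, here is a rough sketch in Python of the
split-and-despatch step as I imagine it. It is not mutt's code: the
function name split_pattern is made up for illustration, it assumes
two-character operators such as ~i and ~s, and it handles only double
quotes (no parentheses, no AND/OR/NOT). The only rewriting it does is
to drop the outer quotes and turn \" into "; everything else reaches
the regex engine intact.

```python
def split_pattern(pattern):
    """Split a pattern into (operator, EXPR) pairs (hypothetical helper)."""
    parts = []
    i, n = 0, len(pattern)
    while i < n:
        if pattern[i].isspace():
            i += 1
            continue
        if pattern[i] != "~" or i + 1 >= n:
            raise ValueError(f"expected ~<op> at position {i}")
        op = pattern[i:i + 2]           # e.g. "~i" or "~s"
        i += 2
        while i < n and pattern[i].isspace():
            i += 1
        if i < n and pattern[i] == '"':
            # Quoted EXPR: drop the outer quotes, turn \" into ", and
            # leave every other character (other backslashes too) alone.
            i += 1
            chars = []
            while i < n and pattern[i] != '"':
                if pattern[i] == "\\" and i + 1 < n and pattern[i + 1] == '"':
                    chars.append('"')
                    i += 2
                else:
                    chars.append(pattern[i])
                    i += 1
            if i >= n:
                raise ValueError("unterminated quote")
            i += 1                      # skip the closing quote
            expr = "".join(chars)
        else:
            # Bare EXPR: runs to the next whitespace, passed on verbatim.
            start = i
            while i < n and not pattern[i].isspace():
                i += 1
            expr = pattern[start:i]
        parts.append((op, expr))
    return parts
```

With this, ~i "foo and bar" yields the EXPR foo and bar, and
~i "\"foo and bar\"" yields "foo and bar" with the quotes kept. One
known gap: a regex that needs a backslash immediately before the
closing quote would be misread, so a real design would have to settle
that case too.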

I'm not claiming that this is the best choice. If this is undesirable
for some reason, then some other character that is less likely to
appear in regexes could be used for grouping, e.g. requiring (foo and
bar) to pass no quotes to the regex engine, and ("foo and bar") to
pass the quotes through. Or the "parser" (whose job really is to
split-and-despatch) could have a policy of stripping off only the
outermost level that is absolutely necessary for grouping.
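
The "strip only the outermost level" policy could be as small as the
following Python sketch. The name strip_outer_group is hypothetical,
and the sketch assumes parentheses are the grouping character and
ignores backslash-escaped parentheses, which a real version would
need to consider.

```python
def strip_outer_group(expr):
    """Remove one outermost (...) pair, but only if it wraps the whole EXPR."""
    if len(expr) < 2 or expr[0] != "(" or expr[-1] != ")":
        return expr
    depth = 0
    for pos, ch in enumerate(expr):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            # The opening parenthesis closed before the end of the
            # string, so it does not group the whole EXPR.
            if depth == 0 and pos != len(expr) - 1:
                return expr
    return expr[1:-1] if depth == 0 else expr
```

So (foo and bar) would reach the regex engine as foo and bar,
("foo and bar") would reach it with its quotes, and something like
(a)|(b) would be left untouched.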

Clearly, some thinking is required, to see what's most convenient for
the user. But that thinking is worth it. For a program as popular as
mutt, one hour of developer thought, on even the smallest issue, might
be equivalent to hundreds of hours of user thought. The moment someone,
instead of doing this thinking, says "Oh, I'll just use an extra level
of the standard parser and let the user deal with it", it's a bad
sign.

>> Anyway, nevermind this rant; back to the original question: Given
>> that mutt has no syntax for specifying that a string must be
>> interpreted literally (an annoying omission), "escaping" is
>> necessary. Simply replacing each $ with \\$ (or with [$]) works for
>> everything I have encountered, so the question is entirely
>> hypothetical: what more might I have to worry about, and has anyone
>> thought about the problem of escaping a string for mutt and solved
>> it? There is no consistency to speak of among regular expression
>> implementations; each has its own quirks; and simply using another
>> language's "escape" function might not work with mutt.
>
> THAT is a far more useful question... I don't have a good generic
> answer for you, though.

Fair enough; I wasn't really expecting one to exist anyway.
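
For the record, the [$] replacement from my original question extends
naturally to other metacharacters. The Python sketch below is only a
starting point, not a solution: the name escape_for_mutt is made up,
the list of metacharacters is my assumption (the usual POSIX extended
regex ones) and is not checked against mutt, and it refuses ^ and
backslash because neither can simply be wrapped in brackets. Whitespace
and quotes are left alone, so the result still has to be quoted as a
whole.

```python
# Characters assumed to be special in the regex; this set is a guess
# and has not been verified against mutt.
WRAPPABLE = set(".$*+?()[{|")
# Characters this sketch does not know how to escape safely.
UNSUPPORTED = set("^\\")

def escape_for_mutt(text):
    """Wrap each assumed metacharacter in a bracket expression, e.g. $ -> [$]."""
    out = []
    for ch in text:
        if ch in UNSUPPORTED:
            raise ValueError(f"cannot escape {ch!r} with brackets")
        if ch in WRAPPABLE:
            out.append("[" + ch + "]")
        else:
            out.append(ch)
    return "".join(out)
```

The appeal of the bracket form is that it introduces no backslashes,
so it does not depend on how many levels of dequoting the string goes
through before it reaches the regex engine.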

Thanks to everyone for the answers to my original question, and for
the suggested improvements.

-S