<<< Date Index >>>     <<< Thread Index >>>

[PATCH] new fmtpipe updates



* On 2007.03.05, in <20070305220556.GC28253@xxxxxxxxxxxxxxxxx>,
*       "Brendan Cully" <brendan@xxxxxxxxxx> wrote:
> 
> I think I may have mentioned in the $xtitle thread, your patch doesn't
> handle things like %> which cause recursive invocations of
> mutt_FormatString. I think this would be easy to fix by having

OK, I've looked into this a bit.  I wrote the patch almost five years
ago (!) and had forgotten a lot of the use cases I had in mind at the
time, as well as the quirks surrounding the issue.

I've attached two patches, fmtpipe.2 and fmtpipe.3.  These are
mutually-exclusive alternatives to current problems.  Fmtpipe.3 takes
a somewhat new approach to the whole thing, for reasons I'll outline
below.  Fmtpipe.2 addresses previous concerns as best I can manage,
given certain constraints.  It includes:

* better determination of whether to filter the string -- instead
  of choosing to filter when '|' is encountered at EOL, we discover
  the pipe up front and operate on a copy of the src buffer with
  the pipe removed.  M_FORMAT_NOFILTER is ORed into flags to prevent
  any additional recursions that may occur.

* A pair of temporary BUFFERs are used so that mutt_extract_token() can
  find the length of the command name and properly discount it from
  line width considerations.

* as a result, %> basically works without incurring more filter
  instances.

But even now, a problem with '%> ' is that it expands into a bunch of
spaces, and passing these into mutt_create_filter() (which uses shell to
execute the filter command) essentially compresses these back into one
space.  You can avoid that by shell-quoting the argument, but then the
extra quotes are counted in line widths, too, and the final result is
mis-sized if you use %>.

So one thought is that we could either search for quotes and ignore
them, or just force-quote the argument to the command.  But this more
or less assumes that there's only one argument to the filter, and that
assumption runs against some use cases, such as the one that originally
inspired the patch.

To summarize those:
  A Breton-speaking user wanted a way of doing attributions in the
  correct Breton way, where one writes "D'al lun" for "On Monday",
  but "D'ar meurzh" for "On Tuesday".  An attribution filter for him
  would need to look at and evaluate %[%w] discretely.  It would be
  troublesome to need to parse the string passed to the filter to
  extract the %[%w] token from the full date string -- much easier just
  to look at %[%w] alone, but not to write %[%w] into the actual
  attribution text.

  Similarly, at the same time as I posted the patch, I posted
  attribution and indent_string filters that would emulate
  Gnus/Supercite attributions.  These also depend on an
  arbitrary behavior with respect to argument quoting.  (See
  <20020420190138.GA24045@xxxxxxxxxxxxxxxxx> in mutt-users for details.)

This is why fmtpipe.1 recurses when the filter output ends in '%': it
lets you pass a few tokens as arbitrary filter arguments, and the filter
emits the mutt string template that you really want evaluated.  For
example:

    set status_format="mutt-status.sh '%f' |"

    $ cat mutt-status.sh
    #!/bin/sh
    # Put the expansion of %f into the xterm title
    printf "\033]0;$1\007" >/dev/tty
    # This string (minus trailing '%') will be the actual status_format
    printf '%r %f %.6s %4l [%M/%m] %> N:%04n D:504d%%'

The final '%%' on the printf tells mutt that the filter output needs
to be evaluated once more, and that's what mutt uses as the status
text.  It's mildly complicated, but very powerful this way, and it
allows the %> to be protected.


I like this current approach well enough -- the "%>" issue can be worked
around just by emitting the real template from the filter.  But in the
case where your filter argument is the exact format string that you
really want, it adds complexity.  It would be nice if we could figure
out which way the user wants it, and win in either case.  That's what
fmtpipe.3 attempts:

1.1. If flags has M_FORMAT_NOFILTER, or if there's no pipe at the end
     of the source string, format as usual and return.

2.1. In all other cases (no M_FORMAT_NOFILTER and there is a pipe),
     use a do { mutt_extract_token } while (MoreArgs) type of loop to
     extract each quoted argument from the command line.

2.2. Iteratively format each argument as if it were the sole format for
     the entire line, and append the resulting formatted buffer to the
     command line buffer as a separate argument.

2.3. Call the filter with this command line, and read the output.

2.4. Check the output for a terminal '%' character.  If present, recurse
     mutt_FormatString() on the output buffer, as we do now, and copy
     the result to the destination buffer and return.

2.5. Otherwise, copy the filter output to the destination buffer and
     return.

I think this strategy will deal with either style, whether you want
your filter to look at discrete tokens or to look at a whole formatted
string.  For example, I can do this:

    $ cat mutt-status.sh
    #!/bin/sh
    title="$1"; shift
    printf "\033]0;$title\007" >/dev/tty
    printf "$@"

        set pager_format="mutt-status '%n' '%S%T %C: %-18.18n [%c] %?y?[%y] 
?%?X?{%X} ?%s'|"
    set status_format="mutt-status '%f' '%r %?V?/%.24V/&%.26f? %.6s %4l [%M/%m] 
%> N:%?n?%04n&----? D:%?d?%04d&----? !:%?F?%04F&----? *:%?t?%04t&----?'|"

The second quoted arguments to pager_format and status_format are the
literal format strings I want for pager and status.  But the format
I put in the first quoted argument sets the window title.  While I'm
in the pager, it shows my mailbox name.  When I view a message, it
shows the sender's name.  I could as easily set index_format to use
mutt-status and make the xterm title show subject.  These illustrate
putting an exact format string into the filter's arguments, but I can
also set a filter that generates its own format by tacking '%' onto the
output, as in the Gnus/Supercite example.

You could do other neat things too, like formatting two alternative
strings and letting the filter choose between them.

I think fmtpipe.3 addresses current concerns and is better designed and
more powerful, meeting a wider variety of use cases.  I'd like to see it
considered for commit.

-- 
 -D.    dgc@xxxxxxxxxxxx        NSIT    University of Chicago
diff -r 5fc8c7cee1dd PATCHES
--- a/PATCHES   Tue Mar 06 18:13:14 2007 -0800
+++ b/PATCHES   Wed Mar 07 13:04:59 2007 -0600
@@ -0,0 +1,1 @@
+patch-1.5.14.dgc.fmtpipe.2
diff -r 5fc8c7cee1dd mutt.h
--- a/mutt.h    Tue Mar 06 18:13:14 2007 -0800
+++ b/mutt.h    Tue Mar 06 22:45:52 2007 -0600
@@ -143,7 +143,8 @@ typedef enum
   M_FORMAT_OPTIONAL    = (1<<3),
   M_FORMAT_STAT_FILE   = (1<<4), /* used by mutt_attach_fmt */
   M_FORMAT_ARROWCURSOR = (1<<5), /* reserve space for arrow_cursor */
-  M_FORMAT_INDEX       = (1<<6)  /* this is a main index entry */
+  M_FORMAT_INDEX       = (1<<6), /* this is a main index entry */
+  M_FORMAT_NOFILTER    = (1<<7)  /* do not allow filtering on this pass */
 } format_flag;
 
 /* types for mutt_add_hook() */
diff -r 5fc8c7cee1dd muttlib.c
--- a/muttlib.c Tue Mar 06 18:13:14 2007 -0800
+++ b/muttlib.c Wed Mar 07 13:03:29 2007 -0600
@@ -996,12 +996,43 @@ void mutt_FormatString (char *dest,               /* 
   char prefix[SHORT_STRING], buf[LONG_STRING], *cp, *wptr = dest, ch;
   char ifstring[SHORT_STRING], elsestring[SHORT_STRING];
   size_t wlen, count, len, col, wid;
+  pid_t pid;
+  FILE *filter;
+  int n, dofilter = 0;
+  char *recycler;
+  char srccopy[LONG_STRING];
+  int  cmdoffset = 0;
 
   prefix[0] = '\0';
   destlen--; /* save room for the terminal \0 */
   wlen = (flags & M_FORMAT_ARROWCURSOR && option (OPTARROWCURSOR)) ? 3 : 0;
   col = wlen;
-    
+
+  if ((flags & M_FORMAT_NOFILTER) == 0)
+  {
+    n = mutt_strlen(src);
+    if (n > 0 && src[n-1] == '|')
+    {
+      BUFFER *tmp, *cmd;
+
+      dofilter = 1;
+      flags |= M_FORMAT_NOFILTER;
+      strncpy(srccopy, src, n);
+      srccopy[n-1] = '\0';
+      src = srccopy;
+
+      /* extract a command name so that we can disregard it from destlen */
+      tmp = mutt_buffer_from(NULL, src);
+      tmp->dptr = tmp->data;
+      cmd = mutt_buffer_init(NULL);
+      mutt_extract_token(cmd, tmp, 0);
+      cmdoffset = tmp->dptr - tmp->data;
+      mutt_buffer_free(&tmp);
+      mutt_buffer_free(&cmd);
+
+    }
+  }
+
   while (*src && wlen < destlen)
   {
     if (*src == '%')
@@ -1086,6 +1117,7 @@ void mutt_FormatString (char *dest,               /* 
        if (count > col)
        {
          count -= col; /* how many columns left on this line */
+         count += cmdoffset; /* add characters consumed by command name */
          mutt_FormatString (buf, sizeof (buf), src, callback, data, flags);
          len = mutt_strlen (buf);
          wid = mutt_strwidth (buf);
@@ -1196,6 +1228,41 @@ void mutt_FormatString (char *dest,              /* 
   }
   *wptr = 0;
 
+  /* Filter this string? */
+  if (dofilter)
+  {
+    wptr = dest;       /* reset write ptr */
+    wlen = (flags & M_FORMAT_ARROWCURSOR && option (OPTARROWCURSOR)) ? 3 : 0;
+    if ((pid = mutt_create_filter(dest, NULL, &filter, NULL)))
+    {
+      n = fread(dest, 1, destlen /* already decremented */, filter);
+      fclose(filter);
+      dest[n] = '\0';
+      if (pid != -1)
+       mutt_wait_filter(pid);
+
+      /* If the result ends with '%', this indicates that the filter
+       * generated %-tokens that mutt can expand.  Eliminate the '%'
+       * marker and recycle the string through mutt_FormatString().
+       * To literally end with "%", use "%%". */
+      if (dest[--n] == '%')
+      {
+       dest[n] = '\0';         /* remove '%' */
+       if (dest[--n] != '%')
+       {
+         recycler = safe_strdup(dest);
+         mutt_FormatString(dest, destlen++, recycler, callback, data, flags);
+         safe_free(&recycler);
+       }
+      }
+    }
+    else
+    {
+      /* Filter failed; erase write buffer */
+      *wptr = '\0';
+    }
+  }
+
 #if 0
   if (flags & M_FORMAT_MAKEPRINT)
   {
diff -r 5fc8c7cee1dd PATCHES
--- a/PATCHES   Tue Mar 06 18:13:14 2007 -0800
+++ b/PATCHES   Wed Mar 07 13:06:01 2007 -0600
@@ -0,0 +1,1 @@
+patch-1.5.14.dgc.fmtpipe.3
diff -r 5fc8c7cee1dd mutt.h
--- a/mutt.h    Tue Mar 06 18:13:14 2007 -0800
+++ b/mutt.h    Tue Mar 06 22:45:52 2007 -0600
@@ -143,7 +143,8 @@ typedef enum
   M_FORMAT_OPTIONAL    = (1<<3),
   M_FORMAT_STAT_FILE   = (1<<4), /* used by mutt_attach_fmt */
   M_FORMAT_ARROWCURSOR = (1<<5), /* reserve space for arrow_cursor */
-  M_FORMAT_INDEX       = (1<<6)  /* this is a main index entry */
+  M_FORMAT_INDEX       = (1<<6), /* this is a main index entry */
+  M_FORMAT_NOFILTER    = (1<<7)  /* do not allow filtering on this pass */
 } format_flag;
 
 /* types for mutt_add_hook() */
diff -r 5fc8c7cee1dd muttlib.c
--- a/muttlib.c Tue Mar 06 18:13:14 2007 -0800
+++ b/muttlib.c Wed Mar 07 14:34:36 2007 -0600
@@ -996,12 +996,86 @@ void mutt_FormatString (char *dest,               /* 
   char prefix[SHORT_STRING], buf[LONG_STRING], *cp, *wptr = dest, ch;
   char ifstring[SHORT_STRING], elsestring[SHORT_STRING];
   size_t wlen, count, len, col, wid;
+  pid_t pid;
+  FILE *filter;
+  int n;
+  char *recycler;
 
   prefix[0] = '\0';
   destlen--; /* save room for the terminal \0 */
   wlen = (flags & M_FORMAT_ARROWCURSOR && option (OPTARROWCURSOR)) ? 3 : 0;
   col = wlen;
-    
+
+  if ((flags & M_FORMAT_NOFILTER) == 0)
+  {
+    n = mutt_strlen(src);
+    if (n > 0 && src[n-1] == '|')
+    {
+      BUFFER *srcbuf, *word, *command;
+      char    srccopy[LONG_STRING];
+      FILE *fp = fopen("/tmp/ass", "w");
+
+      strncpy(srccopy, src, n);
+      srccopy[n-1] = '\0';
+
+      /* prepare BUFFERs */
+      srcbuf = mutt_buffer_from(NULL, srccopy);
+      srcbuf->dptr = srcbuf->data;
+      word = mutt_buffer_init(NULL);
+      command = mutt_buffer_init(NULL);
+
+      /* Iterate expansions across successive arguments */
+      do {
+        /* Extract the command name and copy to command line */
+        mutt_extract_token(word, srcbuf, 0);
+        mutt_buffer_addch(command, '\'');
+        mutt_FormatString(buf, sizeof(buf), word->data, callback, data,
+                          flags | M_FORMAT_NOFILTER);
+        mutt_buffer_addstr(command, buf);
+        mutt_buffer_addch(command, '\'');
+        mutt_buffer_addch(command, ' ');
+      } while (MoreArgs(srcbuf));
+
+      fwrite(command->data, 1, mutt_strlen(command->data), fp);
+
+      wptr = dest;      /* reset write ptr */
+      wlen = (flags & M_FORMAT_ARROWCURSOR && option (OPTARROWCURSOR)) ? 3 : 0;
+      if ((pid = mutt_create_filter(command->data, NULL, &filter, NULL)))
+      {
+        n = fread(dest, 1, destlen /* already decremented */, filter);
+        fclose(filter);
+        dest[n] = '\0';
+        if (pid != -1)
+          mutt_wait_filter(pid);
+  
+        /* If the result ends with '%', this indicates that the filter
+         * generated %-tokens that mutt can expand.  Eliminate the '%'
+         * marker and recycle the string through mutt_FormatString().
+         * To literally end with "%", use "%%". */
+        if (dest[--n] == '%')
+        {
+          dest[n] = '\0';               /* remove '%' */
+          if (dest[--n] != '%')
+          {
+            recycler = safe_strdup(dest);
+            mutt_FormatString(dest, destlen++, recycler, callback, data, 
flags);
+            safe_free(&recycler);
+          }
+        }
+      }
+      else
+      {
+        /* Filter failed; erase write buffer */
+        *wptr = '\0';
+      }
+
+      mutt_buffer_free(&command);
+      mutt_buffer_free(&srcbuf);
+      mutt_buffer_free(&word);
+      return;
+    }
+  }
+
   while (*src && wlen < destlen)
   {
     if (*src == '%')
@@ -1196,6 +1270,41 @@ void mutt_FormatString (char *dest,              /* 
   }
   *wptr = 0;
 
+  /* Filter this string? */
+  if (0)
+  {
+    wptr = dest;       /* reset write ptr */
+    wlen = (flags & M_FORMAT_ARROWCURSOR && option (OPTARROWCURSOR)) ? 3 : 0;
+    if ((pid = mutt_create_filter(dest, NULL, &filter, NULL)))
+    {
+      n = fread(dest, 1, destlen /* already decremented */, filter);
+      fclose(filter);
+      dest[n] = '\0';
+      if (pid != -1)
+       mutt_wait_filter(pid);
+
+      /* If the result ends with '%', this indicates that the filter
+       * generated %-tokens that mutt can expand.  Eliminate the '%'
+       * marker and recycle the string through mutt_FormatString().
+       * To literally end with "%", use "%%". */
+      if (dest[--n] == '%')
+      {
+       dest[n] = '\0';         /* remove '%' */
+       if (dest[--n] != '%')
+       {
+         recycler = safe_strdup(dest);
+         mutt_FormatString(dest, destlen++, recycler, callback, data, flags);
+         safe_free(&recycler);
+       }
+      }
+    }
+    else
+    {
+      /* Filter failed; erase write buffer */
+      *wptr = '\0';
+    }
+  }
+
 #if 0
   if (flags & M_FORMAT_MAKEPRINT)
   {