
[PATCH] Teach mutt_FormatString() to do multibyte-aware truncation



Hi,

I'd like to ask for comments, opinions and suggestions before committing the attached patch. Oh, and for volunteers to test it... :)

It adds mutt_wstr_trunc(), which determines how many leading bytes of a string fit within both a byte limit and a display-width limit, whichever limit is hit first. This is needed in mutt_FormatString():

- for padding with %*X we need to truncate the part written so far
- for all other cases where we append strings, these may get truncated

With something like that applied, I think mutt_FormatString() is mostly multibyte safe/aware.
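
For readers who want to try the truncation rule outside of mutt, here is a minimal standalone sketch of it. It is not the patch itself: the name wstr_trunc_sketch() is hypothetical, it uses plain strlen() instead of mutt_strlen(), and treating non-printable characters as zero columns wide is an assumption of this sketch.

```c
#define _XOPEN_SOURCE 700       /* asks glibc to declare wcwidth() */
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Declared explicitly in case feature-test macros were fixed before this
 * point and <wchar.h> therefore hides wcwidth(). */
extern int wcwidth (wchar_t);

/* Hypothetical stand-in for the patch's mutt_wstr_trunc(): returns how many
 * leading bytes of src fit within maxlen bytes and maxwid screen columns.
 * If width is non-NULL, it receives the columns those bytes occupy. */
size_t wstr_trunc_sketch (const char *src, size_t maxlen, size_t maxwid,
                          size_t *width)
{
  size_t len = 0, cols = 0, n;
  mbstate_t st;

  memset (&st, 0, sizeof (st));
  n = src ? strlen (src) : 0;

  while (n > 0)
  {
    wchar_t wc;
    size_t cw;
    size_t cl = mbrtowc (&wc, src + len, n, &st);

    if (cl == 0)
      break;
    if (cl == (size_t) -1 || cl == (size_t) -2)
    {
      /* invalid or incomplete sequence: count one byte as one column */
      memset (&st, 0, sizeof (st));
      cl = 1;
      cw = 1;
    }
    else
    {
      int w = wcwidth (wc);
      /* assumption of this sketch: non-printable means zero columns */
      cw = (w < 0) ? 0 : (size_t) w;
    }

    /* stop before the first character that would exceed either limit */
    if (cl > maxlen - len || cw > maxwid - cols)
      break;
    len += cl;
    cols += cw;
    n -= cl;
  }

  if (width)
    *width = cols;
  return len;
}
```

The result depends on the active locale. In the default "C" locale every ASCII character is one byte and one column, so both limits behave the same. After setlocale (LC_ALL, "") in a UTF-8 locale, a typical CJK character counts as three bytes but two columns, and either limit may be the one that stops the copy.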

It works well here with all 3 types of padding as well as without padding, though I don't have much mail with multibyte characters...

The only thing I found that doesn't work is padding with combined multibyte characters, but I guess that would require much more work.

  bye, Rocco
--
:wq!
diff --git a/curs_lib.c b/curs_lib.c
index 8fdb681..cb1947a 100644
--- a/curs_lib.c
+++ b/curs_lib.c
@@ -822,6 +822,38 @@ void mutt_paddstr (int n, const char *s)
     addch (' ');
 }
 
+/* See how many bytes to copy from string so it's at most maxlen bytes
+ * long and maxwid columns wide */
+int mutt_wstr_trunc (const char *src, size_t maxlen, size_t maxwid, size_t *width)
+{
+  wchar_t wc;
+  int w = 0, l = 0, cl;
+  size_t cw, n;
+  mbstate_t mbstate;
+
+  if (!src)
+    goto out;
+
+  n = mutt_strlen (src);
+
+  memset (&mbstate, 0, sizeof (mbstate));
+  for (w = 0; n && (cl = mbrtowc (&wc, src, n, &mbstate)); src += cl, n -= cl)
+  {
+    if (cl == (size_t)(-1) || cl == (size_t)(-2))
+      cw = cl = 1;
+    else
+      cw = wcwidth (wc);
+    if (cl + l > maxlen || cw + w > maxwid)
+      break;
+    l += cl;
+    w += cw;
+  }
+out:
+  if (width)
+    *width = w;
+  return l;
+}
+
 /*
  * returns the number of bytes the first (multibyte) character
  * of input consumes:
diff --git a/muttlib.c b/muttlib.c
index ca06d27..2b7be97 100644
--- a/muttlib.c
+++ b/muttlib.c
@@ -1232,16 +1232,16 @@ void mutt_FormatString (char *dest,             /* output buffer */
          else if (soft && pad < 0)
          {
            /* set wptr and wlen back just enough bytes to make sure buf
-            * fits on screen, col needs no adjustments as we skip more input
-            * currently multibyte unaware */
+            * fits on screen, \0-terminate dest so mutt_wstr_trunc()
+            * can correctly compute string's length */
            if (pad < -wlen)
              pad = -wlen;
-           wlen += pad;
-           wptr += pad;
+           *wptr = 0;
+           wlen = mutt_wstr_trunc (dest, wlen + pad, col + pad, &col);
+           wptr = dest + wlen;
          }
          if (len + wlen > destlen)
-           len = destlen - wlen;
-         /* copy as much of buf as possible: multibyte unaware */
+           len = mutt_wstr_trunc (buf, destlen - wlen, COLS - col, NULL);
          memcpy (wptr, buf, len);
          wptr += len;
          wlen += len;
@@ -1304,7 +1304,7 @@ void mutt_FormatString (char *dest,               /* output buffer */
        }
        
        if ((len = mutt_strlen (buf)) + wlen > destlen)
-         len = (destlen - wlen > 0) ? (destlen - wlen) : 0;
+         len = mutt_wstr_trunc (buf, destlen - wlen, COLS - col, NULL);
 
        memcpy (wptr, buf, len);
        wptr += len;
diff --git a/protos.h b/protos.h
index 210ef6e..7a156b6 100644
--- a/protos.h
+++ b/protos.h
@@ -355,6 +355,7 @@ int mutt_search_command (int, int);
 int mutt_smtp_send (const ADDRESS *, const ADDRESS *, const ADDRESS *,
                     const ADDRESS *, const char *, int);
 #endif
+int mutt_wstr_trunc (const char *, size_t, size_t, size_t *);
 int mutt_charlen (const char *s, int *);
 int mutt_strwidth (const char *);
 int mutt_compose_menu (HEADER *, char *, size_t, HEADER *);