Re: MB_LEN_MAX has to be more than 5 if mutt uses UTF-8
* Sun Jul 13 2008 TAKAHASHI Tamotsu <ttakah@xxxxxxxxxxxxxxxxx>
> Now mutt depends on UTF-8, so MB_LEN_MAX needs to be
> at least 6. Therefore I suggest mutt.h checks it.
NOZAKI-san, a multibyte guru, told me that this is not enough.
You must expect errno==E2BIG even if you malloc'ed MB_LEN_MAX *
ibl bytes, because iconv sometimes does "N:M conversion".
For example, imagine your MB_LEN_MAX is 1. This value should be
enough if you use only US-ASCII and ISO-8859-1 because they are
NOT multibyte. But some TRANSLIT locales convert
(e accent aigu) to "e'"
and
(a umlaut) to "ae".
So MB_LEN_MAX*ibl is not enough for obl.
The above two are just "1:N" cases.
The situation is far worse in the real multibyte world.
(Imagine a conversion like ASCII "ae" to UTF-8 a-umlaut. This
is not a real example, but you can see how hard it is.)
So mutt has to handle the E2BIG case with realloc.
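The grow-on-E2BIG idea can be tried outside mutt with a minimal standalone sketch. This is not the patch itself: convert_grow is a hypothetical helper name, plain malloc/realloc stand in for mutt's safe_* wrappers, and the buffer-doubling policy is an assumption of this sketch (the patch adds 6 * ibl instead). It also retries the final flush call on E2BIG, which the patch does not do.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of mutt: convert `in` from charset `from`
 * to charset `to`, enlarging the output buffer whenever iconv reports
 * E2BIG. Returns a malloc'ed NUL-terminated string, or NULL on failure. */
char *convert_grow(const char *to, const char *from, const char *in)
{
  assert(to && from && in);

  iconv_t cd = iconv_open(to, from);
  if (cd == (iconv_t)-1)
    return NULL;

  size_t ibl = strlen(in);
  size_t cap = ibl ? ibl : 1;       /* deliberately small first guess */
  char *buf = malloc(cap + 1);      /* +1 for the terminating NUL */
  if (!buf) {
    iconv_close(cd);
    return NULL;
  }

  char *ib = (char *)in;            /* iconv wants char **, input is not modified */
  char *ob = buf;
  size_t obl = cap;
  int flushing = 0;                 /* 0: converting input, 1: flushing shift state */

  for (;;) {
    size_t r = flushing ? iconv(cd, NULL, NULL, &ob, &obl)
                        : iconv(cd, &ib, &ibl, &ob, &obl);
    if (r != (size_t)-1) {
      if (flushing)
        break;                      /* input converted and state flushed */
      flushing = 1;
      continue;
    }
    if (errno != E2BIG) {           /* EILSEQ, EINVAL, ...: give up */
      free(buf);
      iconv_close(cd);
      return NULL;
    }
    /* Output buffer full: keep what was written, double the capacity. */
    size_t used = (size_t)(ob - buf);
    cap *= 2;
    char *nb = realloc(buf, cap + 1);
    if (!nb) {
      free(buf);
      iconv_close(cd);
      return NULL;
    }
    buf = nb;
    ob = buf + used;
    obl = cap - used;
  }

  *ob = '\0';
  iconv_close(cd);
  return buf;
}
```

Calling convert_grow ("UTF-8", "ISO-8859-1", "\xe9") starts with a 1-byte output buffer, so the first iconv call fails with E2BIG and the loop has to grow the buffer before the 2-byte result fits.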
A patch follows.
An example subject:
> =?iso-2022-jp?b?MSAbJEI3byROGyhCIE11dHQtai11c2VycyAbJEI/PUBBGyhC?=
> =?iso-2022-jp?b?GyRCMEY3byQsJCIkaiReJDkbKEI=?=
produced this dprint output:
> E2BIG: ibl=9, obl=2, new obl=54, safe_realloc(92)
on my MB_LEN_MAX==1 system with an EUC-JP locale.
(In fact, the final flush call iconv (cd, 0, 0, &ob, &obl) may
also return E2BIG, but I didn't check for it in this patch. Some
corner cases might cause an overflow when mutt _reads_ the
incomplete strings, but it shouldn't be a security hole, AFAIK.)
diff -r cc67b008038c charset.c
--- a/charset.c Fri Jul 11 11:34:42 2008 +0200
+++ b/charset.c Wed Jul 16 13:57:59 2008 +0900
@@ -391,6 +391,8 @@
ret1 = iconv (cd, &ib, &ibl, &ob, &obl);
if (ret1 != (size_t)-1)
ret += ret1;
+ else /* if (errno == E2BIG) */
+ ret = -1;
if (ibl && obl && errno == EILSEQ)
{
if (inrepls)
@@ -479,7 +481,19 @@
obl = MB_LEN_MAX * ibl;
ob = buf = safe_malloc (obl + 1);
- mutt_iconv (cd, &ib, &ibl, &ob, &obl, inrepls, outrepl);
+ /* MB_LEN_MAX may be insufficient */
+ while (mutt_iconv (cd, &ib, &ibl, &ob, &obl, inrepls, outrepl) == (size_t)-1)
+ {
+ if (errno != E2BIG)
+ break;
+ dprint(4, (debugfile, "mutt_convert_string E2BIG: ibl=%u, obl=%u, ", ibl, obl));
+ len = ob - buf;
+ obl = 6 * ibl; /* XXX: "6" is a magic number */
+ dprint(4, (debugfile, "new obl=%u, safe_realloc(%u)\n", obl, len+obl+1));
+ safe_realloc (&buf, len + obl + 1);
+ ob = buf + len;
+ }
+ iconv (cd, 0, 0, &ob, &obl);
iconv_close (cd);
*ob = '\0';