
bug#1869: Bug#247366: mutt: segfaults on mbox with binary junk in headers



On Tue, May 18, 2004 at 06:31:33PM -0700, Joshua Kwan wrote:

> I confirm that this patch fixes the problem for me as well. Just like Bernd,
> I got a spam with binary junk in the header, and it segfaulted Mutt.
> Please apply it for the next Mutt upload...

Another problem has been found and fixed.
See http://www.emaillab.org/mutt/download15.html.en

The adjust_line.3 patch fixes a problem displaying Chinese
characters in a UTF-8 environment. (This problem does
not cause any serious symptoms such as segfaults.)

For Debian packagers:
Note that the ``compat'' patch is equivalent to (assumed_charset +
adjust_edited_file + adjust_line + create_rfc2047_params).

For mutt developers:
The attached HTML file describes this bug in detail
and explains why we need the adjust_line patch.

-- 
tamo
Title: Mutt: Bug 1869

Bug 1869

Problem 1 (incomplete fix)

The original problem was reported to be caused by an invalid cast:

j = (int)*s;

Here j was a size_t and s was a char *. TAKIZAWA Takashi sent a patch to TAKAHASHI Tamotsu, who reported it to the mutt BTS. The patch removed the cast, and the problem seemed to be fixed.
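
The following is a small sketch, not the mutt code, of one way such an assignment can go wrong. It assumes a platform where plain char is signed (forced here with a signed char cast so the result is the same everywhere); index_from_char is a hypothetical name used only for illustration. A byte in 0x80-0xFF becomes a negative int, which turns into a huge value once stored in a size_t:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper mirroring `j = (int)*s;` with j declared as size_t.
 * The (signed char) cast is an assumption that models a signed plain char. */
size_t index_from_char(const char *s)
{
    size_t j;

    assert(s != NULL);
    j = (int)(signed char)*s;   /* negative int wraps to a huge size_t */
    return j;
}
```

With this sketch, index_from_char("A") is 65, while index_from_char("\x80") equals (size_t)-128, which is far outside any reasonable array.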

Problem 2

The second problem was found by TAKIZAWA Takashi himself. It affects only Chinese, Korean and Japanese, i.e. languages that use the 0x80-0xFF range as parts of multibyte characters, in a UTF-8 environment. The symptom is that mutt displays fewer characters per line than it should.

This is caused by removing the cast. Without the cast,

(*s < M_MAX_TREE)

is true even when *s is in 0x80-0xFF, because such bytes are negative when char is signed. mutt_mbswidth() then treats the byte as one column wide.

The conditional has to check ((0 <= *s) && (*s < M_MAX_TREE)). So, this problem is fixed by a cast:

unsigned int i;
i = (unsigned int)*s;
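
The difference between the two conditionals can be sketched as follows. This is an illustration only: the value of M_MAX_TREE is an assumption, both function names are hypothetical, and the (signed char) cast models a platform where plain char is signed.

```c
#include <assert.h>
#include <stddef.h>

#define M_MAX_TREE 8   /* assumed value, for illustration only */

/* Signed comparison: bytes 0x80-0xFF are negative, so they pass the test. */
int is_tree_char_signed(const char *s)
{
    assert(s != NULL);
    return (signed char)*s < M_MAX_TREE;
}

/* Unsigned cast as in the fix: negative values wrap to very large
 * unsigned values, so only 0 .. M_MAX_TREE-1 pass the test. */
int is_tree_char_fixed(const char *s)
{
    unsigned int i;

    assert(s != NULL);
    i = (unsigned int)(signed char)*s;
    return i < (unsigned int)M_MAX_TREE;
}
```

For the byte 0xE6 (the first byte of many UTF-8 kanji), the signed version returns 1 and the fixed version returns 0; for the byte 0x01 both return 1.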

Problem 1 (complete fix)

TAKIZAWA Takashi has found the real root cause of Problem 1. Before going into detail, see this table:

                     Data length  Display width  Unpatched  adjust_line.1  adjust_line.2  adjust_line.3
                     (bytes)      (columns)                 (compat.1)
ASCII                1            1              OK         OK             OK             OK
some Japanese chars  2            1              NG         OK             OK             OK
kanji (EUC-JP)       2            2              OK         OK             OK             OK
kanji (UTF-8)        3            2              NG         OK             NG             OK

In many cases, data length is equal to column width. But in UTF-8, each kanji (Chinese character) takes three bytes, all in 0x80-0xFF, and is two columns wide. So mutt_FormatString() has to track two quantities: data length and column width. mutt_mbswidth() calculates the latter.

TAKIZAWA Takashi tried to store both in one variable, wlen, and this was the root cause of muttbug#1869. He has since written a correct patch, which uses two variables: wlen and col.
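
The idea of keeping two separate counters can be sketched like this. It is not the adjust_line patch itself: copy_fit, utf8_len and utf8_width are hypothetical names, and the width rule (kana and CJK ideographs are two columns, everything else one) is a crude assumption standing in for a real width function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Byte length of the UTF-8 sequence starting with lead byte c. */
static size_t utf8_len(unsigned char c)
{
    if (c >= 0xF0) return 4;
    if (c >= 0xE0) return 3;
    if (c >= 0xC0) return 2;
    return 1;
}

/* Assumed width rule: U+3040-U+30FF and U+4E00-U+9FFF take 2 columns. */
static size_t utf8_width(const unsigned char *p, size_t n)
{
    unsigned long cp;

    if (n != 3) return 1;
    cp = ((p[0] & 0x0FUL) << 12) | ((p[1] & 0x3FUL) << 6) | (p[2] & 0x3FUL);
    if ((cp >= 0x3040 && cp <= 0x30FF) || (cp >= 0x4E00 && cp <= 0x9FFF))
        return 2;
    return 1;
}

/* Copy whole characters while both the byte budget (destlen, including
 * the NUL) and the column budget (cols) hold. Returns columns used. */
size_t copy_fit(char *dest, size_t destlen, const char *src, size_t cols)
{
    size_t wlen = 0;   /* bytes written so far */
    size_t col = 0;    /* screen columns used so far */

    assert(dest != NULL && src != NULL && destlen > 0);
    while (*src) {
        size_t n = utf8_len((unsigned char)*src);
        size_t i, w;

        for (i = 1; i < n; i++)          /* stop at a truncated sequence */
            if (src[i] == '\0') break;
        if (i < n) break;
        w = utf8_width((const unsigned char *)src, n);
        if (wlen + n >= destlen || col + w > cols) break;
        memcpy(dest + wlen, src, n);
        wlen += n;
        col += w;
        src += n;
    }
    dest[wlen] = '\0';
    return col;
}
```

Given the six-byte UTF-8 string "\xE6\xBC\xA2\xE5\xAD\x97" (two kanji) and a 16-byte buffer, a limit of 3 columns copies one character (3 bytes, 2 columns), and a limit of 4 columns copies both (6 bytes, 4 columns). A single counter cannot express both limits at once.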

In any case, unpatched mutt can't handle multibyte characters correctly: its mutt_FormatString() treats COLS, strlen() and sizeof() as both length and width. This is painful for users of multibyte languages, so please try the adjust_line patch.

Thomas, please include this patch. It is not only useful but also stable and safe now.