bug#1869: Bug#247366: mutt: segfaults on mbox with binary junk in headers

To: 1869@xxxxxxxxxxxx, 247366@xxxxxxxxxxxxxxx, mutt-dev@xxxxxxxx
Subject: bug#1869: Bug#247366: mutt: segfaults on mbox with binary junk in headers
From: TAKAHASHI Tamotsu <ttakah@xxxxxxxxxxxxxxxxx>
Date: Fri, 28 May 2004 23:44:46 +0900
In-reply-to: <20040519013133.GA18804@xxxxxxxxxxxxxxxxxxxxxxxxxx>
List-unsubscribe: <mailto:mutt-dev-request@mutt.org?body=unsubscribe>
Mail-followup-to: 1869@xxxxxxxxxxxx, 247366@xxxxxxxxxxxxxxx, mutt-dev@xxxxxxxx
References: <20040504182530.GA9962@xxxxxxxxxxxxxxxxxxxxx> <20040509090611.GA702@xxxxxxx> <20040519013133.GA18804@xxxxxxxxxxxxxxxxxxxxxxxxxx>
Sender: owner-mutt-dev@xxxxxxxx
User-agent: Mutt/1.5.6i

On Tue, May 18, 2004 at 06:31:33PM -0700, Joshua Kwan wrote:

> I confirm that this patch fixes the problem for me as well. Just like Bernd,
> I got a spam with binary junk in the header, and it segfaulted Mutt.
> Please apply it for the next Mutt upload...

Another problem has been found and fixed.
See http://www.emaillab.org/mutt/download15.html.en

The adjust_line.3 patch fixes a problem displaying Chinese
characters in a UTF-8 environment. (This problem does
not cause any serious symptoms like segfaults.)

For Debian packagers:
Note that the ``compat'' patch is equivalent to (assumed_charset +
adjust_edited_file + adjust_line + create_rfc2047_params).

For mutt developers:
The attached HTML file describes this bug in detail
and explains why we need the adjust_line patch.

-- 
tamo

Title: Mutt: Bug 1869

Bug 1869

Problem 1 (incomplete fix)

The original problem was reported to be caused by an invalid cast:

j = (int)*s;

Here j was a size_t and s was a char*. TAKIZAWA Takashi sent a patch to TAKAHASHI Tamotsu, who reported it to the mutt BTS. The patch removed the cast, and that seemed to fix the problem.

Problem 2

The second problem was found by TAKIZAWA Takashi himself. It affects only Chinese, Korean and Japanese, i.e. languages whose multibyte characters use bytes in the 0x80-0xFF range, in a UTF-8 environment. The symptom is that mutt displays fewer characters per line than it should.

This is caused by removing the cast. Without the cast,

(*s < M_MAX_TREE)

is true even when *s is in 0x80-0xFF, because such a byte is negative when char is signed. mutt_mbswidth() then treats it as one column wide.

The conditional has to check ((0 <= *s) && (*s < M_MAX_TREE)). So this problem is fixed by a cast to unsigned int, which turns negative values into very large positive ones:

unsigned int i;

i = (unsigned int)*s;
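
To make the signedness issue concrete, here is a minimal sketch of the three variants of the check. It is not the code from the patch: SKETCH_MAX_TREE is a hypothetical stand-in for M_MAX_TREE (its value of 16 is an assumption), the function names are made up for illustration, and signed char is used explicitly so that the behaviour does not depend on whether plain char is signed on a given platform.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for M_MAX_TREE; the real value is not shown here. */
#define SKETCH_MAX_TREE 16

/* Upper bound only: a byte in 0x80-0xFF is negative as a signed char,
 * so it passes this test by mistake. */
int sketch_is_tree_naive(const signed char *s)
{
    assert(s != NULL);
    return *s < SKETCH_MAX_TREE;
}

/* Both bounds checked explicitly, as the text says the conditional must. */
int sketch_is_tree_checked(const signed char *s)
{
    assert(s != NULL);
    return (0 <= *s) && (*s < SKETCH_MAX_TREE);
}

/* The cast variant: a negative value becomes a very large unsigned int,
 * so a single upper-bound comparison rejects it. */
int sketch_is_tree_cast(const signed char *s)
{
    unsigned int i;

    assert(s != NULL);
    i = (unsigned int)*s;
    return i < SKETCH_MAX_TREE;
}
```

With a byte such as 0xE6 (the first byte of a UTF-8 kanji, -26 as a signed char), the naive check accepts it while the other two reject it. All three agree on small non-negative values.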

Problem 1 (complete fix)

TAKIZAWA Takashi has found the real root cause of Problem 1. Before going into the details, see this table:

                     Data Length  Display Width  Unpatched  adjust_line.1  adjust_line.2  adjust_line.3
                     (bytes)      (columns)                 (compat.1)
ASCII                1            1              OK         OK             OK             OK
some Japanese chars  2            1              NG         OK             OK             OK
kanji (EUC-JP)       2            2              OK         OK             OK             OK
kanji (UTF-8)        3            2              NG         OK             NG             OK

In many cases, data length is equal to column width. In UTF-8, however, each kanji (Chinese character) takes three bytes, all in the 0x80-0xFF range, and is two columns wide. So mutt_FormatString() has to handle two parameters: data length and column width. mutt_mbswidth() calculates the latter.

TAKIZAWA Takashi tried to store both in one variable, wlen, and this was the root cause of muttbug#1869. He has already written a correct patch, which uses two variables: wlen and col.
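
The idea of tracking bytes and columns separately can be sketched as follows. This is only an illustration, not the adjust_line patch: all sketch_* functions are hypothetical, the UTF-8 decoder is deliberately minimal, and the width rule (two columns for U+4E00-U+9FFF, one column otherwise) is a simplifying assumption where real code would rely on something like wcwidth() or mutt_mbswidth().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Decode one UTF-8 sequence at s into *cp and return its byte length.
 * Invalid or truncated input is consumed one byte at a time as U+FFFD. */
size_t sketch_utf8_decode(const unsigned char *s, unsigned long *cp)
{
    size_t len, k;

    assert(s != NULL && cp != NULL);
    if (s[0] < 0x80)                { *cp = s[0];        return 1; }
    else if ((s[0] & 0xE0) == 0xC0) { *cp = s[0] & 0x1F; len = 2; }
    else if ((s[0] & 0xF0) == 0xE0) { *cp = s[0] & 0x0F; len = 3; }
    else if ((s[0] & 0xF8) == 0xF0) { *cp = s[0] & 0x07; len = 4; }
    else                            { *cp = 0xFFFD;      return 1; }

    for (k = 1; k < len; k++) {
        /* A NUL terminator fails this test, so we never read past it. */
        if ((s[k] & 0xC0) != 0x80) { *cp = 0xFFFD; return 1; }
        *cp = (*cp << 6) | (s[k] & 0x3F);
    }
    return len;
}

/* Simplified width rule (an assumption): CJK Unified Ideographs are
 * two columns wide, everything else is one. */
int sketch_width(unsigned long cp)
{
    return (cp >= 0x4E00 && cp <= 0x9FFF) ? 2 : 1;
}

/* Copy whole characters from src into dest while BOTH budgets hold:
 * destlen bytes (including the terminating NUL) and maxcols columns.
 * wlen counts bytes written, col counts columns used. Returns col. */
size_t sketch_fit(char *dest, size_t destlen, const char *src, size_t maxcols)
{
    const unsigned char *s = (const unsigned char *)src;
    size_t wlen = 0, col = 0;

    assert(dest != NULL && src != NULL && destlen > 0);
    while (*s) {
        unsigned long cp;
        size_t n = sketch_utf8_decode(s, &cp);
        size_t w = (size_t)sketch_width(cp);

        if (wlen + n >= destlen || col + w > maxcols)
            break;
        memcpy(dest + wlen, s, n);
        wlen += n;
        col += w;
        s += n;
    }
    dest[wlen] = '\0';
    return col;
}
```

For the two-kanji string "\xE6\xBC\xA2\xE5\xAD\x97" (six bytes, four columns), a limit of three columns stops after the first character with wlen at 3 and col at 2. A single counter cannot represent that state, which is why the two have to be kept apart.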

In any case, unpatched mutt can't handle multibyte characters correctly. Its mutt_FormatString() treats COLS, strlen() and sizeof() as both length and width, which is a pain for users of multibyte languages. So, try the adjust_line patch.

Thomas, please include this patch. This is not only useful, but also stable and safe now.
