Re: [Mutt] #2956: Recipient address broken if containing Š character (UTF-8 code: 0xc5 0xA0)
#2956: Recipient address broken if containing Š character (UTF-8 code: 0xc5 0xA0)
Comment (by Vincent Lefevre):
{{{
On 2007-09-18 07:13:35 -0000, Mutt wrote:
> Mutt tries to convert your terminal input in $charset into each of
> the charsets specified in $send_charset. If it fails with the first
> one, it tries the second etc, but if all of them fail, mutt uses your
> terminal's $charset.
This last feature is a bug: this isn't documented and this isn't what
the user wants in general (if the user wants to use the terminal's
$charset, he can include it explicitly in $send_charset). There are
better solutions such as using a replacement character or returning
an error so that the user can fix the header.
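To illustrate the "return an error" option, here is a minimal sketch
using POSIX iconv(). This is not Mutt's actual code: the function name
pick_send_charset() and its parameters are hypothetical. It tries each
candidate charset in order and reports failure instead of silently
falling back to the terminal's $charset.

```c
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not Mutt's code: try each candidate charset in
 * order. Returns the index of the first charset that can represent `in`
 * completely (the converted, NUL-terminated text is left in `out`), or
 * -1 if none can, so that the caller can report an error.
 * `outsz` must be at least 1. */
static int pick_send_charset(const char *in, const char *from,
                             const char *const *candidates, size_t n,
                             char *out, size_t outsz)
{
    for (size_t i = 0; i < n; i++) {
        iconv_t cd = iconv_open(candidates[i], from);
        if (cd == (iconv_t)-1)
            continue;               /* unsupported conversion: try next */

        char *ip = (char *)in;      /* iconv() takes a non-const pointer */
        size_t il = strlen(in);
        char *op = out;
        size_t ol = outsz - 1;      /* keep room for the final NUL */

        /* Accept only a clean conversion: 0 irreversible conversions.
         * (size_t)-1 means invalid input, an unrepresentable character,
         * or an output buffer that is too small. */
        size_t r = iconv(cd, &ip, &il, &op, &ol);
        if (r == 0)
            r = iconv(cd, NULL, NULL, &op, &ol);  /* flush shift state */
        iconv_close(cd);

        if (r == 0) {
            *op = '\0';
            return (int)i;
        }
    }
    return -1;                      /* no silent fallback to $charset */
}
```

With the candidates "us-ascii" and "utf-8", an address containing Š
would skip the first charset and be sent as UTF-8; with "us-ascii"
alone, the function returns -1 and the user can fix the header.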
> Normally this works fine, since the last item in $send_charset is utf-8
> and everything should be convertible into UTF-8.
Yes, except that even if the isspace() bug is fixed, the user may
configure $send_charset without utf-8 in it, or the user may still
generate invalid sequences for some reason.
> Personally I don't care if 0xA0 wouldn't be recognized as space -
> probably no one uses NBSP to delimit several email addresses within
> recipient list.
I don't think this should be regarded as correct anyway. But Mutt
should be consistent, and treat NBSP in the same way, whatever the
charmap is. Otherwise this may confuse those who use several locales.
I think that
    return (9 <= wc && wc <= 13) || wc == 32;
would be sufficient in general, except if Mutt can also use a charmap
not based on ASCII (but I suppose that it would always be EBCDIC in
this case).
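
As a self-contained sketch of that test (the function name
ascii_iswspace() is only an illustration, not an existing Mutt
function), assuming an ASCII-based charmap:

```c
#include <wchar.h>

/* Hypothetical helper: locale-independent whitespace test.
 * Only the ASCII whitespace characters count as space:
 * TAB, LF, VT, FF, CR (9 to 13) and SPACE (32).
 * NBSP (0xA0) is never a space, whatever the charmap is. */
static int ascii_iswspace(wchar_t wc)
{
    return (9 <= wc && wc <= 13) || wc == 32;
}
```

Unlike isspace() or iswspace(), the result does not depend on the
current locale, so ascii_iswspace(0xA0) is 0 everywhere.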
}}}
--
Ticket URL: <http://dev.mutt.org/trac/ticket/2956#comment:>