Re: [Mutt] #2956: Recipient address broken if containing Š character (UTF-8 code: 0xc5 0xA0)

To: petr.hroudny@xxxxxxxxx, pdmef@xxxxxxx
Subject: Re: [Mutt] #2956: Recipient address broken if containing Š character (UTF-8 code: 0xc5 0xA0)
From: Mutt <fleas@xxxxxxxx>
Date: Tue, 18 Sep 2007 08:17:42 -0000
Cc: mutt-dev@xxxxxxxx
In-reply-to: <035.cd543830fbb7d534dcda2ba635cf7aca@xxxxxxxx>
List-post: <mailto:mutt-dev@mutt.org>
List-unsubscribe: send mail to majordomo@mutt.org, body only "unsubscribe mutt-dev"
Mail-followup-to: fleas@xxxxxxxx
References: <035.cd543830fbb7d534dcda2ba635cf7aca@xxxxxxxx>
Reply-to: fleas@xxxxxxxx
Sender: owner-mutt-dev@xxxxxxxx

#2956: Recipient address broken if containing Š character (UTF-8 code: 0xc5 0xA0)

Comment (by Vincent Lefevre):

 {{{
 On 2007-09-18 07:13:35 -0000, Mutt wrote:
 >  Mutt tries to convert your terminal input in $charset into each of
 >  the charsets specified in $send_charset. If it fails with the first
 >  one, it tries the second etc, but if all of them fail, mutt uses your
 >  terminal's $charset.

 This last feature is a bug: this isn't documented and this isn't what
 the user wants in general (if the user wants to use the terminal's
 $charset, he can include it explicitly in $send_charset). There are
 better solutions such as using a replacement character or returning
 an error so that the user can fix the header.

 >  Normally this works fine, since the last item in $send_charset is utf-8
 >  and everything should be convertible into UTF-8.

 Yes, except that even if the isspace() bug is fixed, the user may
 configure $send_charset without utf-8 in it, or the user may still
 generate invalid sequences for some reason.

 >  Personally I don't care if 0xA0 wouldn't be recognized as space -
 >  probably no one uses NBSP to delimit several email addresses within
 >  recipient list.

 I don't think this should be regarded as correct anyway. But Mutt
 should be consistent, and treat NBSP in the same way, whatever the
 charmap is. Otherwise this may confuse those who use several locales.
 I think that

   return (9 <= wc && wc <= 13) || wc == 32;

 would be sufficient in general, except if Mutt can also use a charmap
 not based on ASCII (but I suppose that it would always be EBCDIC in
 this case).
 }}}
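
The one-line predicate suggested in the comment above can be wrapped in a small function to show what it accepts. This is only a sketch: the name `ascii_only_isspace` is hypothetical and is not Mutt's actual function, and it assumes an ASCII-based charmap, as the comment itself does.

```c
#include <stdbool.h>
#include <wchar.h>

/* Hypothetical helper (not taken from Mutt's source): reports whether wc is
 * one of the six ASCII whitespace characters, without consulting the
 * current locale. Assumes an ASCII-based charmap. */
bool ascii_only_isspace(wchar_t wc)
{
    /* 9..13 are TAB, LF, VT, FF and CR; 32 is SPACE. */
    return (9 <= wc && wc <= 13) || wc == 32;
}
```

With this predicate the value 0xA0 is never reported as a space, whatever the locale, while TAB, newline and SPACE always are.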

-- 
Ticket URL: <http://dev.mutt.org/trac/ticket/2956#comment:>
