Re: [Mutt] #2956: Recipient address broken if containing Š character (UTF-8 code: 0xc5 0xA0)

To: petr.hroudny@xxxxxxxxx, pdmef@xxxxxxx
Subject: Re: [Mutt] #2956: Recipient address broken if containing Š character (UTF-8 code: 0xc5 0xA0)
From: Mutt <fleas@xxxxxxxx>
Date: Tue, 18 Sep 2007 07:13:35 -0000
Cc: mutt-dev@xxxxxxxx
In-reply-to: <035.cd543830fbb7d534dcda2ba635cf7aca@xxxxxxxx>
List-post: <mailto:mutt-dev@mutt.org>
List-unsubscribe: send mail to majordomo@mutt.org, body only "unsubscribe mutt-dev"
Mail-followup-to: fleas@xxxxxxxx
References: <035.cd543830fbb7d534dcda2ba635cf7aca@xxxxxxxx>
Reply-to: fleas@xxxxxxxx
Sender: owner-mutt-dev@xxxxxxxx

#2956: Recipient address broken if containing Š character (UTF-8 code: 0xc5 0xA0)

Comment (by phr):

 > Mutt doesn't need to recognize //TRANSLIT strings. The fact that I have
 > a //TRANSLIT string in my $charset should not have any effect on what
 > Mutt sends. $charset is only a terminal-related variable.

 Mutt tries to convert your terminal input from $charset into each of the
 charsets specified in $send_charset. If the first one fails, it tries the
 second, and so on; if all of them fail, mutt uses your terminal's
 $charset.

 Normally this works fine, since the last item in $send_charset is utf-8
 and everything should be convertible into UTF-8. But due to the isspace()
 bug, the corrupted string is invalid even in UTF-8, so mutt thinks none of
 the charsets in $send_charset is suitable and uses your $charset with
 //TRANSLIT.
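
 The fallback described above can be sketched roughly as follows. This is
 not Mutt's code: choose_send_charset() and can_convert() are hypothetical
 names, and can_convert() is only a stand-in for a real conversion attempt
 (e.g. via iconv). It assumes the input is meant to be UTF-8 and knows just
 two targets, us-ascii and utf-8.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if every byte of s is 7-bit ASCII. */
static int is_ascii(const char *s)
{
    for (; *s; s++)
        if ((unsigned char)*s >= 0x80)
            return 0;
    return 1;
}

/* Simplified structural UTF-8 check: each lead byte must be followed by the
 * right number of 10xxxxxx continuation bytes. It does not reject overlong
 * forms or surrogates, which is enough for this illustration. */
static int is_valid_utf8(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;

    while (*p) {
        int need;

        if (*p < 0x80)
            need = 0;
        else if ((*p & 0xE0) == 0xC0)
            need = 1;
        else if ((*p & 0xF0) == 0xE0)
            need = 2;
        else if ((*p & 0xF8) == 0xF0)
            need = 3;
        else
            return 0;               /* stray continuation or invalid lead byte */
        p++;
        while (need-- > 0) {
            if ((*p & 0xC0) != 0x80)
                return 0;           /* truncated sequence (also catches NUL) */
            p++;
        }
    }
    return 1;
}

/* Hypothetical stand-in for a real conversion attempt. */
static int can_convert(const char *text, const char *to)
{
    if (strcmp(to, "us-ascii") == 0)
        return is_ascii(text);
    if (strcmp(to, "utf-8") == 0)
        return is_valid_utf8(text);
    return 0;                       /* unknown target: treat as failure */
}

/* Try each entry of send_charsets in order; if none works, fall back to
 * the terminal charset, as described in the comment above. */
const char *choose_send_charset(const char *text,
                                const char *const *send_charsets, size_t n,
                                const char *charset)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (can_convert(text, send_charsets[i]))
            return send_charsets[i];
    return charset;
}
```

 With the list { "us-ascii", "utf-8" }, the intact bytes 0xc5 0xA0 select
 utf-8, while a lone 0xc5 (the 0xA0 having been eaten as whitespace) fails
 both entries and falls through to the terminal charset.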

 Thus we really need to fix the isspace() problem, and IMHO not only in
 1.6: some simple fix is also needed for 1.4 and 1.5. In mbyte.c we
 already have:

 {{{
 int iswspace (wint_t wc)
 {
   if (Charset_is_utf8 || charset_is_ja)
     return (9 <= wc && wc <= 13) || wc == 32;
   else
     return (0 <= wc && wc < 256) ? isspace (wc) : 0;
 }
 }}}

 So I think the easiest solution for 1.4 and 1.5 would be to write a local
 isspace() function the same way, and for 1.6 consider the proper
 solution.
 Personally I don't care if 0xA0 isn't recognized as a space; probably
 no one uses NBSP to delimit several email addresses within a recipient
 list.
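
 A minimal sketch of such a local function, modelled on the UTF-8 branch of
 the iswspace() quoted above. The names ascii_isspace() and
 skip_ascii_space() are hypothetical and do not come from the Mutt source;
 this only illustrates the idea, it is not a proposed patch.

```c
/* Locale-independent whitespace test: only TAB, LF, VT, FF, CR and SPACE
 * count, so a byte such as 0xA0 is never treated as whitespace. */
static int ascii_isspace(int c)
{
    return (9 <= c && c <= 13) || c == 32;
}

/* Skip leading whitespace byte by byte using the check above.
 * The cast avoids passing negative values on platforms where char is signed. */
const char *skip_ascii_space(const char *s)
{
    while (ascii_isspace((unsigned char)*s))
        s++;
    return s;
}
```

 Because the test never consults the locale, the second byte of Š (0xA0) is
 left alone whatever LC_CTYPE is set to. The cost, as noted above, is that
 NBSP in a single-byte locale is no longer treated as a separator.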

 However, not being able to type characters whose UTF-8 encoding contains
 0xA0 is a major problem.

-- 
Ticket URL: <http://dev.mutt.org/trac/ticket/2956#comment:20>
