
Re: [Mutt] #2956: Recipient address broken if containing Š character (UTF-8 code: 0xc5 0xA0)



#2956: Recipient address broken if containing Š character (UTF-8 code: 0xc5 0xA0)

Changes (by pdmef):

  * owner:  mutt-dev => pdmef
  * status:  new => assigned
  * version:  1.4 => 1.5.16
  * component:  mutt => charset
  * milestone:  => 1.6

Comment:

 Replying to [ticket:2956 phr]:

 > When trying to send email to a recipient whose name contains the Š
 > character, mutt corrupts the recipient's full name by omitting Š and
 > another character.

 > I.e. recipient's fullname is Šorman but gets corrupted to oman

 > There was already a bug describing strange mutt behaviour when 0xA0 is
 > present in a string, and indeed the UTF-8 representation of Š is
 > 0xC5 0xA0.

 What are your locale settings?

 I can confirm that a recipient like this cannot be handled even with hg
 tip.

 However, a quick test showed that this does not happen with multibyte
 characters in general.

 Just a wild guess: 0xA0 is the non-breaking space (at least in Latin-1),
 so this may be caused by improper multibyte handling, e.g. applying
 isspace() or something similar to the individual bytes of a UTF-8 string.

 For example, this code

 {{{
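 #include <ctype.h>
 #include <locale.h>   /* setlocale(), LC_CTYPE */
 #include <stdio.h>

 int main(void) {
   /* setlocale() takes a category constant as its first argument */
   if (!setlocale(LC_CTYPE,"C")) return 1;
   printf("%d\n",isspace(' '));
   printf("%d\n",isspace(0xA0));
   /* same checks in a Latin-1 locale, where 0xA0 is the non-breaking space */
   if (!setlocale(LC_CTYPE,"de_DE.ISO8859-1")) return 1;
   printf("%d\n",isspace(' '));
   printf("%d\n",isspace(0xA0));
   return 0;
 }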
 }}}

 correctly gives 1, 0, 1 and 1 on OS X.
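
 To illustrate the guess above without depending on an installed Latin-1
 locale, here is a minimal sketch. All three helper names are hypothetical
 and not taken from the mutt sources: latin1_like_space() merely stands in
 for what isspace() reports in a Latin-1 locale, and is_ascii_space() is
 one possible byte-safe alternative. Whether mutt actually goes wrong this
 way is still unverified.

```c
#include <stddef.h>

/* Hypothetical helper: matches only ASCII whitespace, so bytes >= 0x80
   (including UTF-8 lead and continuation bytes) never match. */
static int is_ascii_space(unsigned char c) {
  return c == ' ' || (c >= '\t' && c <= '\r');
}

/* Hypothetical stand-in for isspace() in a Latin-1 locale:
   additionally treats 0xA0 (non-breaking space) as whitespace. */
static int latin1_like_space(unsigned char c) {
  return is_ascii_space(c) || c == 0xA0;
}

/* Count the bytes of s that the given predicate classifies as whitespace. */
static size_t count_matching(const char *s, int (*pred)(unsigned char)) {
  size_t n = 0;
  for (; *s; s++)
    if (pred((unsigned char)*s)) n++;
  return n;
}
```

 For the UTF-8 bytes of "Šorman" ("\xC5\xA0orman"),
 count_matching(name, latin1_like_space) returns 1, because the second byte
 of Š is mistaken for a space, while count_matching(name, is_ascii_space)
 returns 0. Any byte-wise whitespace skipping or splitting built on the
 first kind of test would cut the name in the middle of the Š.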

-- 
Ticket URL: <http://dev.mutt.org/trac/ticket/2956#comment:1>