Re: [Mutt] #2956: Recipient address broken if containing Š character (UTF-8 code: 0xc5 0xA0)
#2956: Recipient address broken if containing Š character (UTF-8 code: 0xc5 0xA0)
Changes (by pdmef):
* owner: mutt-dev => pdmef
* status: new => assigned
* version: 1.4 => 1.5.16
* component: mutt => charset
* milestone: => 1.6
Comment:
Replying to [ticket:2956 phr]:
> When trying to send email to a recipient who's name contains the Š character,
> mutt corrupts the recipient's fullname by omitting Š and another character.
> I.e. recipient's fullname is Šorman but gets corrupted to oman
> There was already some bug describing strange mutt's behaviour when 0xA0 is present
> in a string - and indeed, the UTF-8 representation of Š is 0xC5 0xA0
What are your locale settings?
I can confirm that a recipient like this cannot be handled even with hg
tip.
However, a quick test showed that this is not in general the case with
multibyte characters.
Just a really, really wild guess: 0xA0 is the non-breaking space (at least in
Latin-1), so it may be due to improper multibyte handling, e.g. by simply
calling isspace() on single bytes or something like that.
For example, this code
{{{
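#include <ctype.h>
#include <locale.h>   /* setlocale(), LC_CTYPE */
#include <stdio.h>

int main(void) {
    /* isspace() returns any nonzero value for true; !! normalizes it to 1. */
    /* "C" locale: only ASCII whitespace counts as space. */
    if (!setlocale(LC_CTYPE, "C")) return 1;
    printf("%d\n", !!isspace(' '));
    printf("%d\n", !!isspace(0xA0));
    /* Latin-1 locale: 0xA0 is the non-breaking space. */
    if (!setlocale(LC_CTYPE, "de_DE.ISO8859-1")) return 1;
    printf("%d\n", !!isspace(' '));
    printf("%d\n", !!isspace(0xA0));
    return 0;
}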
}}}
correctly gives 1, 0, 1 and 1 on OS X, i.e. 0xA0 counts as whitespace only
in the Latin-1 locale.
--
Ticket URL: <http://dev.mutt.org/trac/ticket/2956#comment:1>