Re: [Mutt] #2956: Recipient address broken if containing Š character (UTF-8 code: 0xc5 0xA0)

To: petr.hroudny@xxxxxxxxx, pdmef@xxxxxxx
Subject: Re: [Mutt] #2956: Recipient address broken if containing Š character (UTF-8 code: 0xc5 0xA0)
From: Mutt <fleas@xxxxxxxx>
Date: Fri, 14 Sep 2007 15:57:21 -0000
Cc: mutt-dev@xxxxxxxx
In-reply-to: <035.cd543830fbb7d534dcda2ba635cf7aca@xxxxxxxx>
List-post: <mailto:mutt-dev@mutt.org>
List-unsubscribe: send mail to majordomo@mutt.org, body only "unsubscribe mutt-dev"
Mail-followup-to: fleas@xxxxxxxx
References: <035.cd543830fbb7d534dcda2ba635cf7aca@xxxxxxxx>
Reply-to: fleas@xxxxxxxx
Sender: owner-mutt-dev@xxxxxxxx

#2956: Recipient address broken if containing Š character (UTF-8 code: 0xc5 0xA0)

Comment (by pdmef):

 Replying to [comment:4 Vincent Lefevre]:

 > Mutt uses isspace() and isprint() at various places. I don't think
 > this is correct. Either Mutt needs to know what is a space or what
 > is printable on ASCII strings, in which case it should use its own
 > functions, or it needs to know such information on wide characters
 > (as this is what should be used on non-ASCII strings), in which
 > case it should use iswspace() and so on.

 Teaching mutt itself what counts as a space seems wrong to me: one
 single-byte locale may define 0xA0 as a non-breaking space while another
 may not. I think that decision belongs in the C library, not in mutt.

 Deciding at runtime, at each call site, whether the input is single-byte
 or multibyte also seems wrong to me, as that likely means writing
 duplicate code.

 I think the only practical solution is to fix the places where
 single-byte locale functions such as isspace() are used on input that may
 be multibyte.

 Switching from char to wchar_t would IMHO increase the memory
 requirements quite a lot, so I don't know whether using it consistently
 everywhere is the way to go.

-- 
Ticket URL: <http://dev.mutt.org/trac/ticket/2956#comment:8>
