Re: [Mutt] #2956: Recipient address broken if containing Š character (UTF-8 code: 0xc5 0xA0)
Comment (by pdmef):
Replying to [comment:4 Vincent Lefevre]:
> Mutt uses isspace() and isprint() at various places. I don't think
> this is correct. Either Mutt needs to know what is a space or what
> is printable on ASCII strings, in which case it should use its own
> functions, or it needs to know such information on wide characters
> (as this is what should be used on non-ASCII strings), in which
> case it should use iswspace() and so on.
Teaching mutt itself which bytes count as spaces seems wrong to me: one
single-byte locale may define 0xA0 as a non-breaking space while another
may not. I think that classification belongs in the C library, not in
mutt.
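To illustrate, here is a minimal sketch (not mutt code; count_space_bytes
is a hypothetical name) of a byte-wise scan that leaves the classification
to the C library. Whether the 0xA0 byte of Š (0xC5 0xA0) gets counted
depends entirely on the current LC_CTYPE locale:

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: count the bytes of s that isspace() reports as
 * whitespace in the current locale. The cast to unsigned char is needed
 * because passing a negative char value to isspace() is undefined. */
size_t count_space_bytes(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; s++) {
        if (isspace((unsigned char)*s))
            n++;
    }
    return n;
}
```

In the "C" locale, count_space_bytes("\xC5\xA0") returns 0. In a locale
whose isspace() treats 0xA0 as a space it would return 1, which is the
kind of byte-wise misclassification this ticket describes.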
Deciding at runtime, at each call site, whether the input is single-byte
or multibyte also seems wrong to me, as that would likely mean writing
duplicate code.
I think the only practical solution is to fix the places where single-byte
locale functions such as isspace() are applied to input that may be
multibyte.
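As a rough sketch of what such a fix could look like (skip_space_mb is a
hypothetical helper, not an existing mutt function), the caller decodes
one multibyte character at a time with mbrtowc() and classifies it with
iswspace(), so a trailing byte such as 0xA0 is never tested on its own:

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: return a pointer to the first character of s that
 * is not whitespace, decoding multibyte characters according to the
 * current LC_CTYPE locale. */
const char *skip_space_mb(const char *s)
{
    mbstate_t st;
    size_t left = strlen(s);

    memset(&st, 0, sizeof st);
    while (left > 0) {
        wchar_t wc;
        size_t n = mbrtowc(&wc, s, left, &st);

        if (n == (size_t)-1 || n == (size_t)-2)
            break;      /* invalid or truncated sequence: stop here */
        if (!iswspace((wint_t)wc))
            break;      /* first non-space character found */
        s += n;
        left -= n;
    }
    return s;
}
```

This assumes the program has called setlocale() so that mbrtowc() decodes
the user's encoding. How invalid sequences should be handled is a design
choice; the sketch simply stops at them.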
Switching from char to wchar_t would IMHO increase the memory requirements
quite a lot, so I don't know whether using it consistently is the way to
go.
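For a feel of the memory cost, a small sketch (narrow_bytes and wide_bytes
are hypothetical names) that compares the storage a string needs as char
with what it would need as wchar_t, terminator included:

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Bytes needed to store s as a char string, including the terminator. */
size_t narrow_bytes(const char *s)
{
    return strlen(s) + 1;
}

/* Bytes needed to store s as a wchar_t string, including the terminator.
 * Returns 0 if s is not valid in the current locale's encoding. */
size_t wide_bytes(const char *s)
{
    mbstate_t st;
    const char *src = s;
    size_t n;

    memset(&st, 0, sizeof st);
    /* With a null destination, mbsrtowcs() only counts wide characters. */
    n = mbsrtowcs(NULL, &src, 0, &st);
    if (n == (size_t)-1)
        return 0;
    return (n + 1) * sizeof(wchar_t);
}
```

sizeof(wchar_t) is implementation-defined (commonly 4 with glibc, 2 on
Windows), so for mostly-ASCII text the wide form is typically several
times larger.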
--
Ticket URL: <http://dev.mutt.org/trac/ticket/2956#comment:8>