
Re: [Mutt] #2956: Recipient address broken if containing Š character (UTF-8 code: 0xc5 0xA0)



#2956: Recipient address broken if containing Š character (UTF-8 code: 0xc5 0xA0)

Comment (by Vincent Lefevre):

 {{{
 On 2007-09-19 08:44:38 +0200, Rocco Rutte wrote:
 > * Vincent Lefevre [07-09-18 15:17:30 +0200] wrote:
 >> It doesn't necessarily make sense as the $charset may be completely
 >> local to the machine (e.g. 'x-my-charset'). I think that trying to
 >> convert the local charset to the last item of $send_charset, which
 >> should be the most general charset (e.g. utf-8), makes more sense.
 >
 > In theory I agree. But $send_charset is user configurable and
 > doesn't have to contain utf-8, it could even be empty. And even
 > with utf-8 (as in your case), conversion may still fail, not
 > because the last item isn't generic enough but because the input is
 > invalid.
 >
 > Even in that case mutt has to do something.

 Conversion may also fail for $charset. So Mutt has to do something in
 this case too, and I think that Mutt should use replacement characters.
 In any case, there should be a way to avoid $charset being used as a
 fallback for $send_charset, as this doesn't always make sense.

 >> I think it is important to let the user control the fallback.
 >
 > I don't think that makes a lot of sense since it's a kind of
 > micro-optimization, IMHO. Because at that point, no charset did fit
 > and mutt is likely going to send out broken content anyway, so by
 > letting the user control it you only give him control over the
 > specific way it's broken, not whether it's broken at all.

 Sending a subject encoded in UTF-8 with some replacement characters
 for any invalid sequences in the input is much less broken than
 sending a subject using a non-standard charset (leading to a
 completely unreadable subject).

 > For the case that all conversions failed because $send_charset is
 > wrongly configured and the input is valid, $charset is the best
 > choice, so I think it's really only about the case of broken input.

 I think that utf-8 would be better than $charset as at least one knows
 that it is a standard charset (whereas $charset isn't necessarily).
 }}}
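
The fallback argued for in this comment can be sketched roughly as follows. This is only an illustration in Python, not Mutt's actual code: the function name `choose_send_charset` and its parameters are hypothetical, and it assumes the local $charset is one that the Python codec registry knows about. Each entry of $send_charset is tried in order, and if none fits, the text goes out as utf-8 with replacement characters instead of falling back to $charset.

```python
def choose_send_charset(raw, charset, send_charsets):
    """Hypothetical helper: return (charset_name, encoded_bytes) for sending.

    raw           -- header bytes in the local charset (Mutt's $charset)
    charset       -- name of the local charset
    send_charsets -- candidate charsets in order (Mutt's $send_charset)
    """
    # Decode the local bytes; invalid sequences become U+FFFD instead of
    # aborting the whole conversion.
    text = raw.decode(charset, errors="replace")

    # Use the first candidate that can represent the text exactly.
    for candidate in send_charsets:
        try:
            return candidate, text.encode(candidate)
        except (UnicodeEncodeError, LookupError):
            continue  # not representable, or unknown charset name

    # Assumed policy from this comment: fall back to utf-8, a standard
    # charset, rather than to the possibly non-standard local charset.
    return "utf-8", text.encode("utf-8")
```

With this sketch, a valid "Š" (0xc5 0xa0 in utf-8) is sent in the first listed charset that contains it, while a truncated sequence such as a lone 0xc5 is sent as utf-8 with a single replacement character, even when the candidate list is empty.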

-- 
Ticket URL: <http://dev.mutt.org/trac/ticket/2956#comment:>