Re: [Mutt] #2956: Recipient address broken if containing Š character (UTF-8 code: 0xc5 0xA0)



#2956: Recipient address broken if containing Š character (UTF-8 code: 0xc5 0xA0)

Comment (by Rocco Rutte):

 {{{
 [ could somebody please fix trac to not mangle folded subjects? :) ]

 Hi,

 * Kyle Wheeler [07-09-17 09:39:02 -0500] wrote:
 > On Monday, September 17 at 10:38 AM, quoth Vincent Lefevre:
 >> On 2007-09-16 11:59:33 -0500, David Champion wrote:
 >>>>>> Subject: Re: [Mutt] #2956: =?UTF-8//TRANSLIT?Q?Reci?=
 >>>>>>         =?UTF-8//TRANSLIT?Q?pient_address_broken_if_containing_?=
 >>>>>>         =?UTF-8//TRANSLIT?Q?=C5?= character (UTF-8 code: 0xc5 0xA0)

 >>>> I do not set $send_charset. My $charset contains //TRANSLIT, but this one
 >>>> is correct.

 >>> I thought that //TRANSLIT was a libiconv extension, not defined by spec.
 >>> It's definitely not supported by all iconv implementations, so those
 >>> wouldn't be able to parse this encoding string (regardless of
 >>> correctness).

 >> The //TRANSLIT is supported by the libiconv implementations I use here.

 > The fact that it's supported by the software that you use doesn't make it a
 > valid encoding to send over the internet. We have standards for a reason!

 I'm absolutely sure he knows this and supports your position strongly.
 That's not the point... :)

 > If mutt is generating this, then something is
 > wrong... perhaps mutt should recognize and strip out //TRANSLIT strings?

 Yes, but that's another issue (it already has a table of valid character
 set names, it just needs to match them for $send_charset).

 > Wait, so, your mutt generates an encoding that doesn't contain //TRANSLIT if
 > it's encoding a Euro symbol? That really is quite strange.

 Yes and no. First, this one really is ISSPACE() related I think. After
 editing a message when composing it, mutt removes trailing spaces in
 mutt_read_rfc822_line().

 As a consequence, 1) you can't write messages with a subject having a
 trailing space (nor any other header, btw) (really, not even in us-ascii
 unless you change it in the compose menu) and 2) it's the reason why 0xc5
 0xa0 can't be encoded properly: the 0xa0 is likely removed so that mutt
 only needs to encode 0xc5.

 So technically it invalidates the UTF-8 itself and afterwards more or
 less encodes the broken subject, I think.

    bye, Rocco
 }}}
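
The byte-level effect described in the comment above can be reproduced with a
minimal sketch. It is not Mutt's code: `strip_trailing_space()` and
`LOCALE_SPACE_BYTES` are hypothetical names, and the sketch assumes a platform
whose `isspace()` classifies byte 0xA0 as whitespace, which is what the comment
suspects but does not confirm.

```python
# Assumption: the C library's isspace() treats 0xA0 as whitespace in the
# active locale. This byte set stands in for that locale-dependent check.
LOCALE_SPACE_BYTES = b" \t\n\v\f\r\xa0"


def strip_trailing_space(line: bytes) -> bytes:
    """Drop trailing bytes one at a time while they look like whitespace."""
    end = len(line)
    while end > 0 and line[end - 1] in LOCALE_SPACE_BYTES:
        end -= 1
    return line[:end]


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# U+0160 (S with caron) is encoded in UTF-8 as the two bytes 0xC5 0xA0.
subject = "Subject: \u0160".encode("utf-8")
stripped = strip_trailing_space(subject)

print(subject, is_valid_utf8(subject))
print(stripped, is_valid_utf8(stripped))
```

Because the stripping works on single bytes rather than on decoded characters,
it removes only the second half of the two-byte sequence and leaves a lone 0xC5
lead byte, which is no longer valid UTF-8.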

-- 
Ticket URL: <http://dev.mutt.org/trac/ticket/2956#comment:>