
Re: mutt/2560: Mutt chokes on invalid charset in UTF environment



The following reply was made to PR mutt/2560; it has been noted by GNATS.

From: Alain Bench <veronatif@xxxxxxx>
To: bug-any@xxxxxxxxxxxxx
Cc: 
Subject: Re: mutt/2560: Mutt chokes on invalid charset in UTF environment
Date: Fri, 8 Dec 2006 11:36:50 +0100 (CET)

 Hello Christian! Sorry to reply late... Overbooked, huge backlog, sorry.
 
  On Tuesday, November 21, 2006 at 12:40:12 +0100, Christian Ebert wrote:
 
 > Mutt chokes on attached spam message with charset=iso-8859-8-i.
 
     ISO-8859-8-i is an official IANA-registered charset. The -i
 variant contains the same Hebrew characters as ISO-8859-8, coded the
 same. The only difference is in the direction of writing lines: bidi
 layout already done, or still to be done. Libiconv doesn't know the -i
 variant, but it seems safe to alias it:
 
 | charset-hook ^iso-8859-8-i$ iso-8859-8
 
     With it, you should see the characters properly converted, and no
 more crashes, hopefully. You might see right-to-left lines reversed,
 though, I suppose...
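 
     To illustrate what such an alias does, here is a minimal Python
 sketch, not Mutt's actual code. The names CHARSET_HOOKS,
 apply_charset_hooks and decode_body are hypothetical, and the
 case-insensitive match is an assumption. Python's own codec table
 stands in for iconv here.

```python
import re

# Hypothetical alias table modeled on the charset-hook line above.
# Case-insensitive matching is an assumption made for this sketch.
CHARSET_HOOKS = [
    (re.compile(r"^iso-8859-8-i$", re.IGNORECASE), "iso-8859-8"),
]


def apply_charset_hooks(label, hooks=CHARSET_HOOKS):
    """Return the alias of the first matching hook, else the label unchanged."""
    for pattern, alias in hooks:
        if pattern.search(label):
            return alias
    return label


def decode_body(raw, label):
    """Decode raw bytes using the (possibly aliased) charset label."""
    return raw.decode(apply_charset_hooks(label))
```

     The decoded characters come out in the order they were stored. No
 bidi reordering is attempted, which is why right-to-left lines may
 still look reversed.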
 
     A second-level problem is that it's an HTML mail: your browser
 has to either know -8-i or have it aliased to plain -8.
 
 
 > The otherwise (in an UTF-environment) indispensable assumed-charset
 > patch is of no help here.
 
     No help, because $assumed_charset doesn't apply at all to this case.
 Here the MIME label is proper, but the charset is unknown to iconv. By
 default Mutt then does pass-thru display, with no conversion. In general
 that case could be put under control of the $unknown_charset patch
 (perhaps with $unknown_charset=us-ascii), if the specific charset-hook
 proposed above were not even more appropriate.
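 
     The distinction between those cases can be sketched roughly as
 follows. This is only an illustration in Python under my own
 assumptions, not how Mutt or the patches are implemented: the function
 choose_charset and its mode strings are hypothetical, and
 codecs.lookup() stands in for iconv_open(), so the set of known names
 differs from Libiconv's.

```python
import codecs


def choose_charset(mime_label, assumed_charset=None, unknown_charset=None):
    """Return (charset, mode); a charset of None means pass-thru display."""
    if not mime_label:
        # No MIME label at all: the only case where an assumed charset applies.
        if assumed_charset:
            return assumed_charset, "assumed"
        return None, "pass-thru"
    try:
        # Stand-in for iconv_open(): is this charset name known at all?
        codecs.lookup(mime_label)
        return mime_label, "labeled"
    except LookupError:
        # Proper label, but unknown to the converter.
        if unknown_charset:
            return unknown_charset, "unknown-fallback"
        return None, "pass-thru"
```

     With a label the converter does not know, the assumed charset is
 never consulted: the result is either the unknown-charset fallback or
 pass-thru.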
 
     I don't know what exactly freezes/crashes (no problem here under
 Linux), but it's probably neither Mutt nor Libiconv (which is not used
 anymore after a failed iconv_open()). Libc functions choking on invalid
 characters would be better suspects. With Mutt's current design, this is
 an unavoidable problem, as long as we do pass-thru mode everywhere...
 Indeed the assumed/unknown/other charset patches make it possible to
 optionally avoid this pass-thru mode, and can be good solutions in the
 user's hands.
 
 
 Bye!   Alain.
 -- 
 Software should be written to deal with every conceivable error
        RFC 1122 / Robustness Principle