
[PATCH] $unknown_charset feature



Hello,

    When a mail has a MIME charset label unknown to iconv, Mutt displays
it in pass-thru mode: the text goes to the screen unconverted, in the
hope that the terminal will behave well. The result can be good when the
mail and terminal charsets match, and in some rare other cases (some
EUC-JP terminals can display ISO-2022-JP text perfectly). Or the result
can be ugly, as with nearly anything on a UTF-8 terminal.

    The user can override pass-thru mode with charset-hooks, which alias
unknown labels to known charsets, for example
"charset-hook ^x-unknown-8bit$ iso-8859-1".

    Unfortunately, there was no way to match *all* unknown charsets.
The patch-1.5.12.ab.unknown_charset.1 implements this via yet another
variable, $unknown_charset, which declares the charset that replaces all
unknown labels. It is only a fallback: specific charset-hooks keep full
precedence.
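
    The precedence described above can be sketched in a few lines of
Python. This is only an illustration of the behaviour as this mail
describes it, not the patch's C code: resolve_charset(), its arguments,
and the "known" set (standing in for what iconv knows) are hypothetical
names, and the case-insensitive matching is an assumption.

```python
import re

def resolve_charset(label, hooks, unknown_charset, known):
    """Pick the charset to convert from, or None for pass-thru.

    hooks           -- list of (regex, alias) pairs, like charset-hook lines
    unknown_charset -- fallback charset name, or None if unset
    known           -- set of lowercase names the converter understands
                       (hypothetical stand-in for iconv)
    """
    # 1. Specific hooks keep full precedence over the fallback.
    for pattern, alias in hooks:
        if re.search(pattern, label, re.IGNORECASE):  # assumption: case-insensitive
            alias = alias.lower()
            # An alias that is not a charset (e.g. "pass-thru") means no conversion.
            return alias if alias in known else None

    # 2. No hook matched: a label the converter knows is used as-is.
    if label.lower() in known:
        return label.lower()

    # 3. Unknown label: fall back to unknown_charset when it is set and usable.
    if unknown_charset and unknown_charset.lower() in known:
        return unknown_charset.lower()

    # 4. Otherwise, the old behaviour: pass-thru.
    return None
```

    With hooks = [("^x-unknown-8bit$", "iso-8859-1")] and a fallback of
"us-ascii", the label "x-unknown-8bit" resolves to iso-8859-1 while any
other unknown label resolves to us-ascii.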

    Usage examples:

 -a) Display everything unknown as if it were Latin-1, working equally
well in any locale:

| set unknown_charset=iso-8859-1


 -b) Display "x-unknown-8bit" as Latin-1, but show all other unknown
labels safely ?-masked, as if they were US-ASCII:

| charset-hook ^x-unknown-8bit$ iso-8859-1
| set unknown_charset=us-ascii


 -c) Mask everything as ASCII, but open pass-thru holes only for
EUC-JP-MS (unknown to stock libiconv) and the ISO-2022-JP* variants
(even though they are well known):

| charset-hook ^euc-JPms$ pass-thru
| charset-hook ^iso-2022-jp(-[1-3])?$ pass-thru
| set unknown_charset=us-ascii

    Note that "pass-thru" is not a keyword; it could just as well be
spelled "blah" or any other name that is not a charset.

 -d) Recreate the locale-dependent pass-thru mode, but safely mask
invalid characters so that the display layout cannot be completely
garbled:

| set unknown_charset=$charset
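
    The "?-masking" used in examples b) and d) can be illustrated with a
small Python sketch. It only mimics the visible effect (bytes that are
invalid in the chosen charset show up as "?"); mask_undecodable() is a
hypothetical helper, not a Mutt function, and Mutt's real conversion goes
through iconv.

```python
def mask_undecodable(raw: bytes, charset: str) -> str:
    """Decode raw bytes, showing every undecodable sequence as '?'."""
    # errors="replace" substitutes U+FFFD for invalid input; map that to '?'.
    return raw.decode(charset, errors="replace").replace("\ufffd", "?")

# 0xE9 is e-acute in Latin-1, but invalid in US-ASCII and
# an incomplete sequence in UTF-8.
sample = b"caf\xe9"
as_latin1 = mask_undecodable(sample, "iso-8859-1")
as_ascii = mask_undecodable(sample, "us-ascii")
as_utf8 = mask_undecodable(sample, "utf-8")
```

    Here as_latin1 keeps the accented letter, while as_ascii and as_utf8
both come out as "caf?": the offending byte is masked instead of reaching
the terminal raw.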


    Spelling note: "pass-thru" is the US spelling of the English
"pass-through", right?


    This feature patch depends on two previously posted bugfix patches.
All three must be applied in this order:

 -1) patch-1.5.12.msyk.iconvhook.1-ab
 -2) patch-1.5.12.ab.M_ICONV_HOOK_sanitize.1
 -3) patch-1.5.12.ab.unknown_charset.1


    Historical information on the genesis of this feature can be found
in a subthread of bug#1876, beginning roughly at:

| Date: Sun, 3 Oct 2004 00:11:08 +0200
| From: Vincent Lefevre <vincent@xxxxxxxxxx>
| Subject: Re: bug#1876: mutt-1.5.6i: Mutt doesn't handle invalid
|  characters when replying to a mail
| Message-ID: <20041002221108.GA2973@xxxxxxxxxxxxx>


Bye!    Alain.
-- 
« if you believe the Content-Type header, I've got a bridge to sell you. »

Attachment: patch-1.5.12.ab.unknown_charset.1.gz
Description: application/gunzip