
Re: charset fixing at display time



Hello Jeff,

 On Tuesday, December 30, 2003 at 5:55:09 PM -0500, Jeff Abrahamson wrote:

> Some email arrive with eight bit characters that are not interpreted
> correctly. Mostly, these are punctuation: octal 222 for single quote,
> octal 223 and 224 for double quotes, octal 226 for dash.

    Those ’ “ ” – are CP-1252 chars you don't have on a Latin-1
terminal. You can see them perfectly on a CP-1252 or UTF-8 terminal
though. Switching to such a term is the best solution.
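
    For anyone who wants to check the mapping, here is a small Python
sketch (Python is my choice for illustration, nothing Mutt uses). It
decodes the four octal bytes from the question both ways and prints the
results in ASCII-safe form.

```python
# The four bytes from the question, written as octal escapes.
raw = b"\222\223\224\226"

# Decoded as CP-1252 they are typographic punctuation:
# U+2019, U+201C, U+201D, U+2013.
as_cp1252 = raw.decode("cp1252")
print(ascii(as_cp1252))

# Decoded as Latin-1 the same bytes land in the C1 control range
# (U+0092, U+0093, U+0094, U+0096), which has no printable glyphs.
as_latin1 = raw.decode("latin-1")
print(ascii(as_latin1))
```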


> One way to handle this is to have my text/plain viewer make the
> modifications. This is what I am currently doing.

    If you want to stay with a Latin-1 terminal, better drop the viewer
and $display_filter and experiment with iconv transliteration: append
"//TRANSLIT" to the $charset setting in your muttrc, even if that
setting is otherwise empty.

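    To give an idea of what transliteration buys you, here is a rough
Python sketch. It is not iconv: the ASCII_FALLBACK table and the
to_latin1_translit() helper are hypothetical names with a hand-picked
mapping, while the real //TRANSLIT rules are larger and depend on the
iconv implementation. Characters that fit in Latin-1 pass through; the
others are replaced by a look-alike, or "?" when none is known.

```python
# Hypothetical, hand-picked table for illustration only.
ASCII_FALLBACK = {
    "\u2018": "'", "\u2019": "'",   # single quotation marks
    "\u201c": '"', "\u201d": '"',   # double quotation marks
    "\u2013": "-", "\u2014": "-",   # dashes
}


def to_latin1_translit(text: str) -> bytes:
    """Encode text as Latin-1, substituting look-alikes where needed."""
    out = []
    for ch in text:
        try:
            out.append(ch.encode("latin-1"))
        except UnicodeEncodeError:
            out.append(ASCII_FALLBACK.get(ch, "?").encode("latin-1"))
    return b"".join(out)


# A CP-1252 message body using the punctuation from the question.
message = b"It\222s \223quoted\224 \226 caf\351".decode("cp1252")
print(to_latin1_translit(message))
```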

> Another way is to get the senders to reform their ways and do proper
> charset tagging instead of none or us-ascii.

    Yes, that's good practice. I recommend spanking. In the meantime
you can correct wrong or missing labels by aliasing them, and tell Mutt
which charset to assume for non-MIME messages:

| charset-hook ^us-ascii$   windows-1252
| charset-hook ^iso-8859-1$ windows-1252
| set assumed_charset=windows-1252

    As CP-1252 is a superset of Latin-1 (apart from the C1 control
codes at 0x80-0x9F, which it reassigns to printable characters), and
Latin-1 is a superset of US-ASCII, this has no adverse effect on normal
messages.
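
    The superset claim is easy to verify with Python's built-in codecs.
The sketch below assumes Python's "cp1252" and "latin-1" tables match
what your iconv uses. It compares the three charsets byte by byte and
collects the bytes where CP-1252 and Latin-1 disagree.

```python
# Every ASCII byte decodes identically under all three charsets.
for b in range(0x00, 0x80):
    byte = bytes([b])
    assert byte.decode("ascii") == byte.decode("latin-1") == byte.decode("cp1252")

# Every printable Latin-1 byte (0xA0-0xFF) decodes identically as CP-1252.
for b in range(0xA0, 0x100):
    byte = bytes([b])
    assert byte.decode("latin-1") == byte.decode("cp1252")

# The only disagreement is 0x80-0x9F: C1 controls in Latin-1, printable
# characters (or unassigned slots, decoded here as U+FFFD) in CP-1252.
differs = [
    b for b in range(0x80, 0xA0)
    if bytes([b]).decode("cp1252", errors="replace") != bytes([b]).decode("latin-1")
]
print([hex(b) for b in differs])
```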


Bye, and a happy new year!      Alain.
-- 
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?