Re: $assumed_charset settings (was: special chars)
On Sunday, March 25, 2007 at 13:30:12 +0200, Christoph Berg wrote:
> CP1251 (iirc, the last digit could be different) is what windows
> usually uses. It is a superset of latin1, with characters in the
> 128-159 range, mainly some quotes. That could also be a good guess.
Right: CP-1252 is the best $assumed_charset for westerners at large
today. It's what I use and advise. It handles about 95% of all sorts
of non-MIME messages correctly: from Windows mailers, from Usenet,
from various webmails, and so on.
However, once $assumed_charset can take a list for bodies,
$assumed_charset="utf-8:cp1252" will be better still.
I mean: as a westerner, the overwhelming majority of the non-spam,
non-MIME mails I receive are in CP-1252 or a subset of it. Then a small
percentage is in UTF-8, a yet smaller percentage in Latin-9, and the
rest in any other charset. For Latin-9 and the others, there is no
generic fix anyway that would not break the majority. In those rather
rare cases, a manual <edit-type> might help. Covering CP-1252 and its
subsets fixes 95% of the problem. Also covering UTF-8 would fix yet
more.
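
To show why UTF-8 would have to come first in such a list, here is a
minimal Python sketch of the try-each-charset-in-order idea. It is not
Mutt code, and I am only assuming a list would behave this way; the
function name decode_with_fallback is made up for the example. UTF-8
decoding is strict and rejects most non-UTF-8 text, while CP-1252
accepts nearly any byte sequence, so it only makes sense as the last
resort.

```python
def decode_with_fallback(raw: bytes, charsets=("utf-8", "cp1252")) -> str:
    """Decode raw bytes with the first charset in the list that fits.

    Hypothetical helper, only illustrating the ordered-list idea.
    `charsets` is assumed to be non-empty.
    """
    for name in charsets:
        try:
            # Strict decoding: raises on byte sequences invalid in `name`.
            return raw.decode(name)
        except UnicodeDecodeError:
            continue
    # Nothing fit cleanly (Python's cp1252 leaves a few bytes such as
    # 0x81 undefined), so decode with the last charset and substitute
    # U+FFFD for the undecodable bytes.
    return raw.decode(charsets[-1], errors="replace")


if __name__ == "__main__":
    print(decode_with_fallback("café".encode("utf-8")))   # UTF-8 body
    print(decode_with_fallback(b"caf\xe9"))               # Latin-1/CP-1252 body
    print(decode_with_fallback(b"\x93quoted\x94"))        # Windows "smart" quotes
```

With the order reversed ("cp1252" first), a UTF-8 body would decode
without any error but as mojibake, because CP-1252 happily accepts the
UTF-8 bytes.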
« if you believe 95% of statistics, I've got a bridge to sell you. »