
Re: More on non-ascii chars in headers



On 27.09.2007 (09:11), Kyle Wheeler wrote:

> >set assumed_charset ="us-ascii:windows-1252:latin-1:utf-8"
> 
> For what it's worth, this setting is pretty pointless for most 
> Westerners. The best setting for Westerners is:
> 
>      set assumed_charset="windows-1252"
> 
> The reason this is better than what you had is:
> 
> 1. There's no advantage to assuming a message is us-ascii instead of 
>     windows-1252. Windows-1252 is a superset of us-ascii, so any 
>     message in us-ascii can be assumed to be windows-1252 without loss.

Point taken.

> 3. Along similar lines, windows-1252 covers nearly the entire set of 
>     possible values, 0 to 255; only 0x81, 0x8D, 0x8F, 0x90 and 0x9D 
>     have no character assigned. Thus, almost no email will *ever* 
>     fail to match windows-1252. The way mutt figures out that a 
>     message isn't in a specific character set is if there are values 
>     in the message that aren't valid in the character set. For 
>     example, in strict ISO 8859-1 (Latin-1), values 0x80 through 0x9F 
>     are unused; thus if they appear in an email, it cannot be encoded 
>     in Latin-1. Windows-1252 may not always be the *right* character 
>     set, but there's no way for mutt to know that.

But should I still remove utf-8 from that list?  What if I receive a
message with characters that are NOT in Windows-1252 but are in utf-8?
Will mutt then fall back on the locale settings and manage anyway, or
will the message still match Windows-1252 but display the wrong
characters?  (Not that it happens very often, I think, but one never
knows...)
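
To make the question concrete, here is a rough Python sketch of the
"first charset that decodes cleanly wins" behaviour described above.
It is not mutt's code: first_matching_charset is a made-up helper,
Python's codecs stand in for the iconv conversions mutt relies on, and
it assumes mutt tries the list entries in order and stops at the first
one that converts without error.

```python
def first_matching_charset(data, candidates):
    """Return (charset, text) for the first candidate that decodes data cleanly."""
    for charset in candidates:
        try:
            return charset, data.decode(charset)
        except UnicodeDecodeError:
            continue  # invalid byte sequence for this charset; try the next one
    return None, None


utf8_bytes = "café".encode("utf-8")  # b'caf\xc3\xa9'

# windows-1252 listed before utf-8: it accepts the bytes, so utf-8 is never tried.
print(first_matching_charset(utf8_bytes, ["us-ascii", "windows-1252", "latin-1", "utf-8"]))

# utf-8 listed first: valid UTF-8 is recognised, anything else falls through.
print(first_matching_charset(utf8_bytes, ["utf-8", "windows-1252"]))
print(first_matching_charset(b"caf\xe9", ["utf-8", "windows-1252"]))
```

Under those assumptions the first call returns the text as "cafÃ©"
(matched, but with the wrong characters), while the other two return
"café".  If mutt really behaves like this sketch, an ordering such as
"utf-8:windows-1252" would seem safer than dropping utf-8 altogether.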

eyolf
 

-- 
System going down in 5 minutes.