Re: More on non-ascii chars in headers



-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Thursday, September 27 at 12:56 PM, quoth Eyolf Østrem:
>This puzzled me at first, because I didn't know where that latin1 
>coding came from, but I assume it is because of send_charset or 
>assumed_charset, right?

Yup. It's send_charset that matters in this case.

>set assumed_charset ="us-ascii:windows-1252:latin-1:utf-8"

For what it's worth, this setting is pretty pointless for most 
Westerners. The best setting for Westerners is:

     set assumed_charset="windows-1252"

The reason this is better than what you had is:

1. There's no advantage to assuming a message is us-ascii instead of 
    windows-1252. Windows-1252 is a superset of us-ascii, so any 
    message in us-ascii can be assumed to be windows-1252 without loss.
2. There's no advantage to falling back to latin-1 after assuming 
    windows-1252. Windows-1252 is a superset of latin-1 (as far as 
    printable characters go), so any message that can be successfully 
    interpreted as windows-1252 will never *need* to fall back to 
    latin-1.
3. Along similar lines, windows-1252 assigns a character to almost 
    every possible value, 0 to 255 (only 0x81, 0x8D, 0x8F, 0x90, and 
    0x9D are left undefined). Thus, practically no email will *ever* 
    fail to match windows-1252. The way mutt figures out that a 
    message isn't in a specific character set is if there are values 
    in the message that aren't valid in the character set. For 
    example, strict Latin-1 (ISO 8859-1) assigns no characters to 
    values 0x80 through 0x9F; thus if they appear in an email, it was 
    almost certainly not written in Latin-1. Windows-1252 may not 
    always be the *right* character set, but there's no way for mutt 
    to know that.
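
If you want to see the fallback logic for yourself, here is a rough 
Python sketch. It is only an illustration: Python's codecs stand in 
for whatever converter mutt uses (they may disagree on edge cases), 
and first_matching_charset and undefined_bytes are names I made up, 
not anything in mutt.

```python
from typing import List, Optional


def first_matching_charset(raw: bytes, candidates: str) -> Optional[str]:
    """Return the first charset in a colon-separated list that decodes raw.

    Hypothetical helper: mimics an assumed_charset-style fallback.
    """
    for name in candidates.split(":"):
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            continue  # invalid byte for this charset, try the next one
        return name
    return None


def undefined_bytes(charset: str) -> List[int]:
    """List the single-byte values that Python's codec refuses to decode."""
    bad = []
    for value in range(256):
        try:
            bytes([value]).decode(charset)
        except UnicodeDecodeError:
            bad.append(value)
    return bad


original = "us-ascii:windows-1252:latin-1:utf-8"

# The same name encoded two different ways.
latin1_msg = "Østrem".encode("latin-1")
utf8_msg = "Østrem".encode("utf-8")

print(first_matching_charset(latin1_msg, original))
print(first_matching_charset(utf8_msg, original))
print([hex(b) for b in undefined_bytes("windows-1252")])
```

Both messages stop at windows-1252, so the latin-1 and utf-8 entries 
are never reached; the UTF-8 one simply gets displayed with the wrong 
characters. The last line lists the only five values that Python's 
windows-1252 codec rejects.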

>set send_charset="us-ascii:iso-8859-1:iso-8859-15:utf-8"

It's probably worth inserting windows-1252 in there, since it's so 
popular. My send_charset looks like this:

set send_charset="us-ascii:iso-8859-1:iso-8859-15:windows-1252:utf-8"

>set charset=utf-8

Chances are, you don't want to set this in your muttrc. Mutt can 
figure it out based on your terminal's locale settings, which allows 
mutt to be more flexible.
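
For the curious, this is roughly how a program can ask the locale 
for its character set. It's a Python sketch of the general idea, not 
mutt's code; terminal_charset is a made-up name, and it assumes a 
Unix-like system where nl_langinfo is available.

```python
import locale


def terminal_charset() -> str:
    """Return the character set named by the current locale settings."""
    try:
        # Adopt whatever LANG / LC_ALL / LC_CTYPE say in the environment.
        locale.setlocale(locale.LC_CTYPE, "")
    except locale.Error:
        # Unknown locale name: keep the default "C" locale instead.
        pass
    return locale.nl_langinfo(locale.CODESET)


print(terminal_charset())
```

Run it in terminals with different locale settings and the answer 
changes with them, which is the flexibility you give up by 
hard-coding charset in your muttrc.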

>so since the "Ø" is part of latin1, that's as far up in the encodings
>that mutt will have to go, and will therefore send it as latin1, is
>that correct?

Yup!
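
In case it helps to see it spelled out, here's a small Python sketch 
of that "first charset that fits" rule. Again, this only imitates the 
behaviour described above using Python's codecs; pick_send_charset is 
a hypothetical name, not a mutt function.

```python
from typing import Optional


def pick_send_charset(text: str, send_charset: str) -> Optional[str]:
    """Return the first charset in the list that can encode all of text."""
    for name in send_charset.split(":"):
        try:
            text.encode(name)
        except UnicodeEncodeError:
            continue  # some character is missing from this charset
        return name
    return None


mine = "us-ascii:iso-8859-1:iso-8859-15:windows-1252:utf-8"

for sample in ["Eyolf", "Østrem", "€", "\u201cquoted\u201d", "日本語"]:
    print(repr(sample), "->", pick_send_charset(sample, mine))
```

Plain ASCII stays us-ascii, "Ø" stops at iso-8859-1, the euro sign 
needs iso-8859-15, curly quotes need windows-1252, and anything else 
ends up as utf-8.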

~Kyle
- -- 
No, I don't know that Atheists should be considered as citizens, nor 
should they be considered patriots. This is one nation under God.
                                  -- George H. W. Bush, August 27, 1987
-----BEGIN PGP SIGNATURE-----
Comment: Thank you for using encryption!

iD8DBQFG+7orBkIOoMqOI14RAlKQAJ9hVOYFv4SsJRMnf89nK9AEn6+zYwCeIB8r
bpOHGsPUu53TtE6VGig9qgE=
=hjc6
-----END PGP SIGNATURE-----