
Re: Charset in Headers



-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Tuesday, October 28 at 09:56 PM, quoth Raphael Brunner:
>> my question is, sometimes I became a mail with a correct charset
>> displayed in (e.g.) mozilla thunderbird, but in mutt, there are other
>> strange characters in the subject, but ONLY there.

That probably means that the subject isn't properly encoded. 
Thunderbird has a few more heuristics for guessing the correct 
encoding: for example, it can look at the body of the message and 
guess that any incorrectly encoded headers are probably in the same 
encoding as the body, whereas mutt evaluates each message component 
individually and does not look at other parts of the message to help 
make informed guesses.

The bottom line, though, is that without proper labeling, it's all 
just guessing. In this case, Thunderbird's method of guessing turned 
out to provide a useful answer. In other situations, Thunderbird's 
method may not guess correctly, and mutt's method may be more 
accurate. We're dealing with broken email here, and displaying such 
email "correctly" with any reliability is essentially impossible.
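
To make the difference between the two guessing strategies concrete, 
here is a rough Python sketch. It is not how either mutt or 
Thunderbird is actually implemented; decode_unlabeled_header and its 
body_charset parameter are names I made up for illustration.

```python
from typing import Optional


def decode_unlabeled_header(raw: bytes, body_charset: Optional[str] = None) -> str:
    """Decode raw, unlabeled header bytes.

    With body_charset=None the header is judged on its own and every
    non-ASCII byte becomes U+FFFD. With a body charset, that charset is
    tried first as a hint. Hypothetical helper, not real client code.
    """
    if body_charset is not None:
        try:
            return raw.decode(body_charset)
        except (UnicodeDecodeError, LookupError):
            pass  # the hint was unusable; fall through to plain ASCII
    return raw.decode("ascii", errors="replace")


# Illustrative sample: a Cyrillic word sent as raw koi8-u bytes.
raw_subject = "тест".encode("koi8-u")

no_hint = decode_unlabeled_header(raw_subject)
good_hint = decode_unlabeled_header(raw_subject, body_charset="koi8-u")
bad_hint = decode_unlabeled_header(raw_subject, body_charset="iso-8859-1")
```

The hint only helps when the body really is in the same charset as 
the header. A wrong hint (bad_hint above) decodes without any error 
and silently produces garbage, which is why neither strategy is 
reliable.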

>> Now, I think, the header is coded as utf8, but mutt don't know 
>> about it.

Why do you think the header is coded as utf8?

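Raw headers aren't allowed to be anything other than US-ASCII; 
anything else is technically wrong. Non-ASCII text in a header has to 
be wrapped in RFC 2047 encoded-words, which name the charset 
explicitly.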
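
For reference, a properly labeled header carries its charset inside 
the header itself. The Python sketch below shows the shape of an RFC 
2047 encoded-word; encode_word is a hypothetical helper of mine that 
assumes the text is short enough for a single encoded-word, and the 
sample subject is made up.

```python
import base64
from email.header import decode_header


def encode_word(text, charset):
    """Build one RFC 2047 "B" (base64) encoded-word.

    Hypothetical helper: assumes the result fits within the 75-character
    limit for a single encoded-word, so it does no splitting or folding.
    """
    payload = base64.b64encode(text.encode(charset)).decode("ascii")
    return "=?{}?B?{}?=".format(charset, payload)


subject = "тест"  # illustrative sample subject
wire = encode_word(subject, "koi8-u")  # pure ASCII, safe to put in a header

# A receiving client reads the charset label instead of guessing.
raw, charset = decode_header(wire)[0]
decoded = raw.decode(charset)
```

Because the label travels with the text, the receiver needs no 
heuristics at all.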

>> google show that iconv is not a working solution.

What?

>> The characters are koir-8 coded,

I thought you said they were utf8 coded.

>> and thunderbird show this as popup on any message.

What?

It sounds like you're not a native English speaker, so I don't mean to 
exhibit frustration. It's just that some of the sentences you have 
used are extremely difficult to understand.

> note: if I save such a message in any folder and then open it with 
> kate (from KDE), set there the charset to cyrillic>koi8-u , then the
> subject-characters are displayed correct.

Interesting. So whoever the sender is, they are putting raw, unlabeled 
koi8-u characters in the subject header. That's *extremely* wrong; 
they need to use a better email program, because sending non-ASCII 
headers without any sort of labeled encoding violates the mail 
standards (RFC 2822 allows only US-ASCII in header fields, and RFC 
2047 defines the labeled encoding they skipped). They're lucky that 
their email comes through at all.

> what would be the cleanest solution? The most mails are western 
> europa.

The cleanest solution would be for the sender to fix their email 
program. :)

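You can try fiddling with your assumed_charset setting, but if most 
of your messages are Western European, that's going to be a tough one 
to make work: the setting applies to every unlabeled message, and the 
same high bytes are valid in both koi8-u and the Western European 
single-byte charsets.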
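
Here is a rough Python sketch of why an ordered list of assumed 
charsets is hard to tune. assumed_charset takes a colon-separated 
list; the rule used here (the first charset that decodes without an 
error wins) is my simplifying assumption rather than mutt's actual 
code, and decode_with_assumed is a made-up name.

```python
def decode_with_assumed(raw, assumed):
    """Try each charset in a colon-separated list, in order.

    Returns (text, charset_used). Assumes the first charset that decodes
    without error wins; hypothetical helper, not mutt's implementation.
    """
    for charset in assumed.split(":"):
        try:
            return raw.decode(charset), charset
        except (UnicodeDecodeError, LookupError):
            continue
    return raw.decode("ascii", errors="replace"), "us-ascii"


# Illustrative samples of unlabeled subjects.
cyrillic = "тест".encode("koi8-u")
western = "Grüße".encode("iso-8859-1")

# Western charset first: the Cyrillic subject is mis-decoded.
cyr_text, cyr_used = decode_with_assumed(cyrillic, "utf-8:iso-8859-1:koi8-u")

# koi8-u first: now the Western subject is mis-decoded instead.
west_text, west_used = decode_with_assumed(western, "utf-8:koi8-u:iso-8859-1")
```

Single-byte charsets accept nearly any byte sequence, so whichever 
one comes first in the list claims every unlabeled message, and the 
later entries are never reached.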

~Kyle
- -- 
A patriot must always be ready to defend his country against his 
government.
                                                        -- Edward Abbey
-----BEGIN PGP SIGNATURE-----
Comment: Thank you for using encryption!

iEYEARECAAYFAkkHiVAACgkQBkIOoMqOI17oFgCfWGL4VtyhwATcSUpU9klA2vXE
TYMAniM2znfoKv3uCYjf4SfWX6SqwaF9
=FQVY
-----END PGP SIGNATURE-----