Re: w3m can't show html mail with charset: gb2312



Haizi Zheng wrote:
> bxuefeng wrote:
> > Subject: Re: 有吗
> 
> Have you noticed that the subject line (as well as the address lines)
> is totally naked? By naked I mean that the characters are plain; they
> haven't been encoded. An encoded one must look like this (for example):
> 
> Subject: =?gb18030?B?suLK1A==?=
> 
> This one tells mutt and many other mail clients that the subject is
> gb18030 encoded, so they know how to display it.
> 
> But if the subject is plain, mutt has no idea which coding system to
> apply, so it will use the default setting.

Agreed.
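
To make the encoded-word point concrete, here is a minimal sketch of how
a client might decode such a header, using Python's standard
email.header module.  The decode_subject helper and its utf-8 fallback
are my own assumptions for illustration; this is not what mutt does
internally.

```python
from email.header import decode_header


def decode_subject(raw_value: str, fallback: str = "utf-8") -> str:
    """Decode RFC 2047 encoded-words; leave plain text untouched."""
    parts = []
    for chunk, charset in decode_header(raw_value):
        if isinstance(chunk, bytes):
            # Encoded-word: the header itself names the charset.
            # If none is given, fall back to an assumed default.
            parts.append(chunk.decode(charset or fallback, errors="replace"))
        else:
            # Plain, unencoded text comes back as str with no charset.
            parts.append(chunk)
    return "".join(parts)


print(decode_subject("=?gb18030?B?suLK1A==?="))
```

With a naked subject there is no charset to read, so the client can
only guess, which is exactly the problem described above.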

> Since your locale is en_US.UTF-8 and it can display the subject
> correctly, I guess the plain subject is IN FACT UTF-8 ENCODED. But I'm
> really not sure about this; you'd better check, which will help us
> settle this problem.

I agree.  But there is conflicting information.  Haizi Zheng's message
back to the list was encoded with gb18030 and the raw characters
quoted there rendered fine.  That leads me to believe that the charset
those were encoded with was actually gb18030.  But since the message
was a remail it could be that the characters were converted by the
mailer to gb18030 automatically.  That is probably the case, since the
original poster said that the raw subjects rendered fine in UTF-8.
I agree completely with your analysis.
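
One way to check the raw bytes rather than guess is to try strict
decoding.  The sketch below is only a heuristic: guess_charset and its
candidate list are hypothetical, and the order matters because gb18030
accepts many byte sequences that are not really gb18030 text, while
strict UTF-8 rejects most non-UTF-8 input.

```python
from typing import Optional, Sequence


def guess_charset(
    raw: bytes, candidates: Sequence[str] = ("utf-8", "gb18030")
) -> Optional[str]:
    """Return the first candidate charset that decodes raw without error."""
    for name in candidates:
        try:
            raw.decode(name)  # strict mode raises on invalid sequences
        except UnicodeDecodeError:
            continue
        return name
    return None


# The same two characters produce different bytes in each charset.
subject = "\u6709\u5417"  # the raw subject text quoted above
print(guess_charset(subject.encode("utf-8")))
print(guess_charset(subject.encode("gb18030")))
```

Saving the message to a file and feeding the Subject line's bytes to
something like this says more than looking at how the terminal happens
to render it.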

> You can press `e' and call the editor out, say Emacs, then if your
> editor is powerful enough, you can get some information about the
> coding system.

I am skeptical of using the editor function to determine much
information.  Frequently the user has already overridden the default
charset for the editor in some completely different way.  I use Emacs,
for example, and force a UTF-8 charset.

> As the `charset' indicates, the body is gb2312 encoded, so if you decode
> the text as utf-8, you surely can't get the right result!
> 
> Here's a summary: 

A good analysis.
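
The body case can be sketched the same way.  This assumes a made-up
message whose Content-Type declares gb2312; the sample body and the
variable names are illustrative only.  Decoding with the declared
charset works, while treating the same bytes as UTF-8 yields
replacement characters.

```python
from email import message_from_bytes

# Hypothetical message: the header declares gb2312 and the body bytes
# really are gb2312 (the two characters are the same test text as in
# the encoded subject example earlier in the thread).
raw_message = (
    b"Content-Type: text/html; charset=gb2312\r\n"
    b"Content-Transfer-Encoding: 8bit\r\n"
    b"\r\n"
    + "<p>\u6d4b\u8bd5</p>".encode("gb2312")
)

msg = message_from_bytes(raw_message)
declared = msg.get_content_charset()       # charset parameter of Content-Type
body_bytes = msg.get_payload(decode=True)  # undo transfer encoding, keep bytes

right = body_bytes.decode(declared)                   # honors the declaration
wrong = body_bytes.decode("utf-8", errors="replace")  # ignores it

print(declared)
print(right)
print(wrong)
```

Whatever hands the HTML part to w3m needs to pass along the declared
charset, or the pager falls back to its own default.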

Bob