Re: wrong charset



On Thu, May 14, 2009 at 11:33:37PM -0300, Luis A. Florit wrote:
> * On 11/05/09 at  1:04, Derek Martin chattered:
> 
> > On Thu, May 07, 2009 at 09:40:00PM -0300, Luis A. Florit wrote:
> > > 1) why ?charset=utf-8 if I am working in an ISO-8859-1 xterm?
> >
> > How do you know that the xterm *is* ISO-8859-1?
> 
> This is what I said in my last email:
> 
>     Good question... :o) I wrote that because of three reasons,
>     but maybe I am still wrong. Let's see:
> 
>     1) The xterm has a drop-down menu where you can choose the encoding.
>     I set this as ISO-8859-1.

As I've mentioned a couple of times now, manually setting encodings is
virtually always a bad idea.  If your environment is set up properly,
your applications should always choose the right encoding on their
own, unless they are ancient and/or broken.

>     2) When I switch to UTF-8 in this drop-down menu, I see accents in
>     mutt and UTF-8 files correctly.

Clearly, your system only supports UTF-8.  You have no choice but to
use that.  But that should work for you just fine...  If your intended
recipients can't read UTF-8 for some reason, just configure Mutt's
send_charset so that ISO-8859-1 appears before UTF-8.  If the message
you write can be encoded in ISO-8859-1, Mutt will send it out that way
instead of UTF-8.
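
In muttrc terms that would be something like
set send_charset="iso-8859-1:utf-8" (a colon-separated list, tried in
order).  The selection rule can be sketched roughly as below.  This is
only an illustration in Python, not Mutt's actual code, and the
function name pick_charset is made up for the example:

```python
def pick_charset(text, candidates=("iso-8859-1", "utf-8")):
    """Return the first charset in `candidates` that can encode `text`
    without loss, or None if none of them can."""
    for name in candidates:
        try:
            text.encode(name)          # raises if a character does not fit
        except UnicodeEncodeError:
            continue                   # try the next charset in the list
        return name
    return None
```

A message containing only Latin-1 characters would go out as
ISO-8859-1; one containing, say, a euro sign would fall through to
UTF-8.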

> Then, it seems that mutt understands my mameo xterm as an UTF-8, and
> ignores the xterm drop-down menu setting. Or that my xterm is not
> converting back the mutt UTF-8 output into this fake ISO-8859-1.

Mutt does not understand anything about your terminal.  Nor does your
xterm convert anything.

Your terminal program simply displays the glyph in the terminal's
selected font that corresponds to each character code emitted by
whatever program is running in it.  You can manually change how those
codes map to glyphs (i.e. the character encoding), but that will
almost always produce wrong results.
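
To illustrate why a manual override goes wrong, here is a small Python
sketch.  It assumes the program emits UTF-8 bytes and the terminal is
forced to interpret them under some other encoding; the helper name
show is hypothetical, and decoding bytes is only a stand-in for what a
terminal does when it picks glyphs:

```python
def show(raw: bytes, terminal_encoding: str) -> str:
    """Interpret the bytes a program emitted the way a terminal set to
    `terminal_encoding` would, replacing anything it cannot decode."""
    return raw.decode(terminal_encoding, errors="replace")

emitted = "é".encode("utf-8")              # the two bytes 0xC3 0xA9

as_utf8 = show(emitted, "utf-8")           # the intended single character
as_latin1 = show(emitted, "iso-8859-1")    # two unrelated characters
```

The bytes never change; only their interpretation does, which is why
the same output looks right under one menu setting and garbled under
another.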

What Mutt understands is the setting of your locale via the $LANG (and
related) environment variable(s).  It uses the encoding that
corresponds to that locale.  It never gets this wrong, unless your
system is horribly broken.  Set your LANG to pt_BR, and let Mutt and
xterm set their encoding as they will.  Everything will work fine.
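
For reference, the usual POSIX precedence among those variables is
LC_ALL, then LC_CTYPE, then LANG.  The Python sketch below only mimics
that lookup for illustration; the real resolution is done by the C
library, and the function name effective_ctype is invented here.  When
the locale name carries no explicit codeset, the system's default for
that locale applies, which this sketch cannot know and reports as None:

```python
def effective_ctype(env):
    """Return (locale_name, codeset) using the POSIX precedence
    LC_ALL > LC_CTYPE > LANG.  `codeset` is None when the name has no
    explicit ".codeset" part."""
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(var)
        if value:                              # unset or empty: keep looking
            base = value.split("@", 1)[0]      # drop any "@modifier"
            _, _, codeset = base.partition(".")
            return value, (codeset or None)
    return None, None
```

Passing os.environ shows what the current shell is asking for; a stray
LC_ALL or LC_CTYPE overriding LANG is a common source of surprises.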

If the problem you're having is that you're including text from some
pre-existing file that's encoded in ISO-8859-1 and it's coming out
garbled, then use the iconv utility to make a copy of the file
converted to UTF-8, and include that instead.  See the man page for
iconv for details.
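
With iconv that is typically something like
iconv -f ISO-8859-1 -t UTF-8 old.txt > new.txt (the file names are
placeholders).  The same conversion can be sketched in Python, in case
iconv isn't handy; convert_file is a made-up helper name, and the
sketch assumes the source file really is in the encoding you name:

```python
from pathlib import Path

def convert_file(src, dst, from_enc="iso-8859-1", to_enc="utf-8"):
    """Write a copy of `src` to `dst`, re-encoded from `from_enc` to
    `to_enc`.  The original file is left untouched."""
    text = Path(src).read_bytes().decode(from_enc)   # bytes -> characters
    Path(dst).write_bytes(text.encode(to_enc))       # characters -> bytes
```

Working on a copy, as suggested above, means a wrong guess about the
source encoding costs nothing.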

-- 
Derek D. Martin    http://www.pizzashack.org/   GPG Key ID: 0xDFBEAD02
-=-=-=-=-
This message is posted from an invalid address.  Replying to it will result in
undeliverable mail due to spam prevention.  Sorry for the inconvenience.
