Re: wrong charset



* On 07/05/09 at 23:24, Kyle Wheeler wrote:

> On Thursday, May  7 at 09:40 PM, quoth Luis A. Florit:
> > > On Monday, May  4 at 05:05 PM, quoth Luis A. Florit:
> > > > I use an ISO-8859-1 encoded xterm in maemo, but :set ?charset
> > > > gives me charset="utf-8".
> > >
> > > Are you setting it in your config somewhere? (test it by running
> > > `mutt -F /dev/null` and seeing what the value of $charset is
> > > there)
> >
> > utf-8. No matter what I do...
> >
> > But I have three charsets:
> >
> > $charset=//TRANSLIT
> > ?charset=utf-8
>
> What? That doesn't make any sense. Are those two lines actually in
> your muttrc?

The only thing in my .muttrc is 'set charset=//TRANSLIT'. But no
matter how I change that, the result is always utf-8.

> I think at least part of the problem here is that you aren't
> understanding what ?charset means. The "charset" variable is almost
> always referred to as $charset, with the dollar sign. If you swap the
> dollar sign for a question mark, that's a way of telling mutt you want
> it to display the value of the variable. It's NOT a way to set the
> variable.

I see.

> In other words, "set ?charset=us-ascii" is completely bogus, and
> meaningless.

Sorry, I think I wasn't clear enough. In the mutt console, you can
look up variables with the ':set' command. When I do
':set charset' I get 'charset="//TRANSLIT"' (as expected, although
in this case it means UTF-8 despite the fact that my xterm is
ISO-8859-1). If I change to iso-8859-1, I get accented characters as \123.

> > In fact, it seems that I am not able to change that ?charset
> > variable to ISO-8859-1.
>
> So, if, while running mutt, you execute the command:
>
>     :set charset=iso-8859-1
>
> What happens? Does an error get displayed?

Now it works... sorry. I must have been doing something silly.

> > > > I tried setting by hand LANG, LC_ALL, LC_CTYPE to pt_BR and
> > > > such, but no luck. No, pt_BR.ISO-8859-1 is not among the xterm
> > > > locales.
> > >
> > > Okay, I think the first thing you need to do here (aside from
> > > ensuring that you're not setting $charset manually somewhere) is
> > > to find out what locales your machine supports. Something like
> > > this will probably work:
> > >
> > > locale -a | grep '^pt_BR'
> >
> > Just pt_BR. But I don't want to change the default language, just
> > the encoding.
>
> If pt_BR is the only pt_BR-related locale you have installed, then
> you're stuck with the default charset, whatever that happens to be.
> If you want access to other charsets (such as utf-8), you'll have to
> install additional locales (e.g. the locale named pt_BR.utf8).
>
> On Debian, this can be done by using `dpkg-reconfigure locales`. I'm
> sure other distributions have similar means of installing/enabling
> additional locales.

I have always used 'LANG=en_US' as my locale in an ISO-8859-1 rxvt
console, and 'charset=//TRANSLIT' or iso-8859-1 in muttrc, and
everything worked fine. So I don't think this has to do with
locales. It seems that mutt does not understand the osso-xterm...?
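
One way to check whether locales are involved is to ask which character
set the current locale actually reports. The sketch below assumes a
Unix-like system and assumes that mutt, like most C programs, derives its
default $charset from the locale set through LANG / LC_ALL / LC_CTYPE.
The function name locale_codeset is made up for this illustration.

```python
import locale


def locale_codeset():
    """Return the character set the current locale reports, or None.

    Hypothetical helper for illustration; it is not part of mutt.
    """
    try:
        # Adopt the locale named by LANG / LC_ALL / LC_CTYPE.
        locale.setlocale(locale.LC_CTYPE, "")
    except locale.Error:
        # The environment names a locale that is not installed.
        return None
    # Unix only: ask the C library for the locale's codeset name.
    return locale.nl_langinfo(locale.CODESET)


if __name__ == "__main__":
    print(locale_codeset())
```

Running it once with LANG=en_US and once with LANG=pt_BR in the same
terminal would show whether the two locales report different charsets.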

> > > Whatever it outputs, those are the values your computer
> > > (currently) understands, and so those are the values that LANG
> > > or the LC_* variables can be set to.
> >
> > I see. But even if I set LC_ALL=pt_BR, I get the messages in
> > Portuguese but the encoding in UTF-8. Exactly the opposite of what
> > I want.
>
> What do you mean "the encoding in UTF-8"? You mean the messages you
> receive are encoded in UTF-8? That's fine; it doesn't matter what
> the messages are encoded in, as long as mutt (and all the supporting
> libraries it uses) know what characters can be displayed, so that it
> can convert from the message's encoding to the correct encoding for
> display on your terminal.

My terminal is ISO-8859-1, so if mutt displays messages in UTF-8
it will be a mess. The one and only issue is that mutt does not
display ISO-8859-1 chars correctly on my console.
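
To make the two cases concrete: the sketch below shows the bytes of one
Portuguese word in both encodings, what an ISO-8859-1 terminal shows when
it is handed UTF-8 bytes unchanged, and the conversion from the message's
encoding to the terminal's encoding. The helper render_for_terminal is a
hypothetical name for illustration, not mutt's implementation.

```python
def render_for_terminal(raw, message_charset, terminal_charset):
    """Convert message bytes into bytes suitable for the terminal.

    Characters the terminal charset cannot represent become '?'.
    """
    text = raw.decode(message_charset)
    return text.encode(terminal_charset, errors="replace")


word = "ação"
utf8_bytes = word.encode("utf-8")          # two bytes per accented letter
latin1_bytes = word.encode("iso-8859-1")   # one byte per accented letter

# What an ISO-8859-1 terminal displays if it receives the UTF-8 bytes as is.
garbled = utf8_bytes.decode("iso-8859-1")

# What it should receive instead: the same text re-encoded for the terminal.
converted = render_for_terminal(utf8_bytes, "utf-8", "iso-8859-1")

if __name__ == "__main__":
    print(utf8_bytes, latin1_bytes)
    print(garbled)
    print(converted == latin1_bytes)
```

The conversion step only works if the program knows the terminal's real
charset, which is exactly what $charset is supposed to tell mutt.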

> > > It's possible that if you really want your xterm to only display
> > > ISO-8859-1 characters, you may have to install the right character
> > > sets (how to do this is often distro-dependent).
> >
> > But my xterm works perfectly with ISO-8859-1; vim, for example,
> > displays it fine. The xterm is not the problem. The problem is that
> > mutt just does not want to understand the encoding.
>
> Okay, we're over-using the term "encoding" here. Let's try and be
> clear about what's going on:
>
> 1. When you run mutt, it reports that the charset it thinks is
> appropriate is utf-8

Yes, if I use 'set charset=//TRANSLIT'

> 2. Nothing you seem to do can convince mutt to avoid utf-8

No, now it accepts 'set charset=iso-8859-1', but still displays
accented characters as \123.
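
Output like \123 looks like an octal escape, which some programs print
for bytes they consider unprintable in the current locale. The sketch
below is only an illustration of that kind of fallback, under the
assumption that this is what is happening here; octal_escape is a
hypothetical helper, not mutt's code.

```python
def octal_escape(raw):
    """Render bytes as text, replacing non-ASCII-printable bytes with \\ooo."""
    parts = []
    for byte in raw:
        if 0x20 <= byte < 0x7F:
            # Printable ASCII is shown unchanged.
            parts.append(chr(byte))
        else:
            # Everything else becomes a three-digit octal escape.
            parts.append("\\%03o" % byte)
    return "".join(parts)


if __name__ == "__main__":
    print(octal_escape("é".encode("iso-8859-1")))
    print(octal_escape("é".encode("utf-8")))
```

One escape per accented letter would point to ISO-8859-1 bytes being
treated as unprintable, while two escapes per letter would point to UTF-8
bytes.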

> It sounds like somewhere in your mutt config, you're setting $charset
> to be utf-8, and then attempting (perhaps with the wrong syntax) to
> set it to be something else.

No, just one line with charset.

> Mutt's generally pretty good at figuring out the right $charset value
> to use, if you leave it to its own devices.
>
> > > On the other hand, if your machine ALREADY correctly understands
> > > UTF8... go with it! UTF8 is far more capable than ISO-8859-1 or any
> > > other ISO charset.
> >
> > Several years ago I tried UTF-8, but the vast majority (I mean,
> > almost 100%) of the emails/texts/etc I read/save are ISO-8859-1
> > (that are not correctly displayed in a UTF8 console). I don't need
> > any of the non-ISO characters in UTF-8.
>
> This is probably not worth arguing about... BUT - ISO-8859-1 files
> *should* display properly on a UTF8 console. If they don't, then
> something (your terminal, your text reader, whatever) is broken.

Well, I have used Red Hat/Fedora since Red Hat 4.1, and ISO text has
never been displayed properly in xterm/rxvt. And the Nokia N800 is no
exception. But perhaps there is something I can do to make that work
properly...

> Now, I admit, mutt isn't very clear in this respect, because it's
> unlike anything else. But "charset" is the name of the variable.
> "$charset" is usually the way it's referred to. However, since mutt
> doesn't have an "echo" command or anything similar, one of the
> developers (I don't know who) thought that a convenient way of
> displaying the current value of a variable would be to refer to it
> with a question mark. In other words "set ?charset" means "what is the
> $charset variable set to?".

I see. I didn't know this. I searched the whole manual for '?' and it
says nothing about variables!

    Thanks for your efforts!!

        Best,

            L.
