Re: wrong charset
On Tuesday, May 12 at 02:56 PM, quoth Luis A. Florit:
> I did it, and mutt sets charset=utf-8.
On the Nokia? Then the Nokia's locales must all be UTF-8-only.
> Because ':set ?charset' gives 'charset=utf-8', and because the
> accented characters appear as garbage.
Okay, so, it would appear that the correct character set (the only one
your locales support) is utf-8. The next question is: why do the
accented characters appear as garbage?
But let's first be clear: when LANG is correctly set to pt_PT and mutt
has correctly detected $charset to be 'utf-8', do the accented
characters show up as octal escapes such as \123, or do they show up
as random characters, such as 'Ã©' where an 'é' should be?
If they show up as random characters, then it would seem to me that the
Nokia is *lying*; that its terminal is, in fact, incapable of
understanding UTF-8, because it's being sent valid UTF-8 byte
sequences and is treating them as Windows-1252 characters instead.
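To make that failure mode concrete, here's a minimal Python 3 sketch
(my own illustration, not anything mutt itself does; the helper name
misread_as_cp1252 is made up) of what valid UTF-8 bytes look like when
something wrongly decodes them as Windows-1252:

```python
def misread_as_cp1252(text):
    """Encode text as UTF-8, then decode those bytes as Windows-1252."""
    raw = text.encode("utf-8")
    # errors="replace" because a few byte values (e.g. 0x81) are
    # undefined in Windows-1252 and would otherwise raise an error.
    return raw.decode("cp1252", errors="replace")


if __name__ == "__main__":
    for ch in ("é", "ã", "’"):
        print(repr(ch), "->", repr(misread_as_cp1252(ch)))
```

If the garbage on the Nokia looks like the right-hand side of that
output, the bytes arriving at the terminal are fine and the terminal
(or its font) is the one misreading them.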
However, if they show up as \123, then for some reason mutt thinks
that those characters are non-printable and is attempting to mask
them. In this case, what may be happening is that the underlying
libraries that mutt relies on are broken and/or unreliable. To work
around these problems, you may need to recompile mutt (and reconfigure
it). Specifically, when you run mutt's ./configure script, add
--without-wc-funcs and maybe --enable-locales-fix as well. Here's what
mutt's build documentation has to say about these:

--enable-locales-fix
    on some systems, the result of isprint() can't be used
    reliably to decide which characters are printable, even if you
    set the LANG environment variable. If you set this option,
    Mutt will assume all characters in the ISO-8859-* range are
    printable. If you leave it unset, Mutt will attempt to use
    isprint() if either of the environment variables LANG, LC_ALL
    or LC_CTYPE is set, and will revert to the ISO-8859-* range if
    they aren't. If you need --enable-locales-fix then you will
    probably need --without-wc-funcs too. However, on a correctly
    configured modern system you shouldn't need either (try
    setting LANG, LC_CTYPE, or LC_ALL instead).

--without-wc-funcs
    by default Mutt uses the functions mbrtowc(), wctomb() and
    wcwidth() provided by the system, when they are available.
    With this option Mutt will use its own version of those
    functions, which should work with 8-bit display charsets,
    UTF-8, euc-jp or shift_jis, even if the system doesn't
    normally support those multibyte charsets.

    If you find Mutt is displaying non-ascii characters as octal
    escape sequences (e.g. \243), even though you have set LANG
    and LC_CTYPE correctly, then you might find you can solve the
    problem with either or both of --enable-locales-fix and
    --without-wc-funcs.

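If you do see octal escapes, the numbers themselves tell you which
bytes the message contained. Here's a rough Python 3 sketch (the
function names are mine and purely illustrative) that turns escapes
back into bytes and tries to read them as UTF-8, falling back to
ISO-8859-1 when they aren't valid UTF-8:

```python
import re


def decode_octal_escapes(shown):
    """Turn text such as r'\303\251' back into the raw bytes it stands for."""
    # Assumes everything outside the escapes is plain ASCII.
    # [0-3][0-7]{2} limits the value to 0-255, i.e. a single byte.
    return re.sub(
        rb"\\([0-3][0-7]{2})",
        lambda m: bytes([int(m.group(1), 8)]),
        shown.encode("ascii"),
    )


def describe(shown):
    """Return (raw bytes, text), decoding as UTF-8 or else ISO-8859-1."""
    raw = decode_octal_escapes(shown)
    try:
        return raw, raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw, raw.decode("latin-1")


if __name__ == "__main__":
    for sample in (r"\243", r"\303\251"):
        print(sample, "->", describe(sample))
```

For instance, \303\251 is the byte pair C3 A9, which is 'é' in UTF-8,
whereas a lone \243 is byte A3, which is only meaningful in an 8-bit
charset ('£' in ISO-8859-1).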
Does that make sense?
>> The terminal shouldn't matter in this case.
>
> Perhaps I should have said this before, but I use the very same .muttrc
> in Fedora and Nokia. Both Fedora's iso-8859-1 rxvt and xterm
> show chars perfectly. It's the Nokia that doesn't. And both have
> the same locales: everything as en_US. What could it be, but the
> console?
More likely than not, it's the system's string manipulation libraries.
There are, of course, more things that can go wrong. The terminal may
be the problem (but it's usually not), or you may not have the right
fonts for the terminal, and so on. But those are unusual problems these
days, and library and/or locale issues are far more common. I'm
operating on the zebra principle here: if you hear hoofbeats, think
horses, not zebras. I suppose it *could* be the terminal, but let's
eliminate the other options first.
>> perl -e ""
>>
>> That SHOULD do nothing at all.
>
> Yep, nothing.
Excellent!
> Perhaps all the Nokia locales are UTF-8 based...?
Probably. Which would make it all the more annoying if their string
manipulation libraries cannot handle UTF-8.
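One way to check what the C library reports for a given locale,
assuming Python 3 happens to be installed on the Nokia (I don't know
that it is), is to ask nl_langinfo(CODESET), which as far as I know is
the same call mutt uses to pick its default $charset. The helper name
codeset_for and the list of locale names below are just my guesses at
what's worth trying:

```python
import locale


def codeset_for(name=""):
    """Return the codeset the C library reports for locale `name`, or None.

    An empty name means "whatever LANG/LC_* say in the environment".
    """
    saved = locale.setlocale(locale.LC_CTYPE)  # remember current setting
    try:
        locale.setlocale(locale.LC_CTYPE, name)
        return locale.nl_langinfo(locale.CODESET)
    except locale.Error:
        return None  # the system does not provide this locale
    finally:
        locale.setlocale(locale.LC_CTYPE, saved)


if __name__ == "__main__":
    for name in ("", "en_US", "en_US.UTF-8", "pt_PT", "pt_PT.UTF-8"):
        print(repr(name), "->", codeset_for(name))
```

If every name that works at all comes back as UTF-8, that would
confirm the "UTF-8-only locales" theory.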
~Kyle
--
Brilliance is like four-wheel drive; it enables a person to get stuck
in even more remote places.
-- Garrison Keillor