
Re: Reading UTF-8 Mail (was Re: e-mail encoding/formatting)



On Mon, May 01, 2006 at 02:06:13PM -0400, Richard Cobbe wrote:
> Ok, I'll bite.  How does one configure mutt to display UTF-8 mails
> correctly?  After a great deal of effort, I've not succeeded.

I can't give you any Mac-specific advice, since I don't use them, but
I can give you a general run-down.

1. You need to have the right locale settings.  Your locale should be
set to something similar to this:

  $ locale
  LANG=en_US.UTF-8
  LC_CTYPE=ko_KR.UTF-8
  LC_NUMERIC="en_US.UTF-8"
  LC_TIME="en_US.UTF-8"
  LC_COLLATE=C
  LC_MONETARY="en_US.UTF-8"
  LC_MESSAGES="en_US.UTF-8"
  LC_PAPER="en_US.UTF-8"
  LC_NAME="en_US.UTF-8"
  LC_ADDRESS="en_US.UTF-8"
  LC_TELEPHONE="en_US.UTF-8"
  LC_MEASUREMENT="en_US.UTF-8"
  LC_IDENTIFICATION="en_US.UTF-8"
  LC_ALL=

In case it is not clear, these are environment variables.  You can set
them in .profile or whatever initialization file Mac's shell uses.

Note that my LC_CTYPE is set so that I can enter Korean characters
(though I don't think this is strictly required anymore).  You should
most likely leave it unset.  In the output above, values enclosed in
quotes are not set explicitly but inherited from $LANG, so you don't
have to set each of them.  LANG alone should be enough, unless you
need the ability to enter non-Latin characters (which I won't get
into now).
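
A minimal sketch of what that might look like in .profile, assuming a
Bourne-style shell and assuming en_US.UTF-8 is the name your system
uses for the locale (adjust both to taste):

```shell
# Set only LANG; the other LC_* categories inherit from it unless
# they are set explicitly.
LANG=en_US.UTF-8
export LANG
```

After logging in again, running locale charmap should report UTF-8 if
the setting took effect.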

Note also that Debian, for example, uses .utf8 instead of .UTF-8 to
identify UTF-8 locales.  The Mac may too; I don't know.  If you run
locale -a, you should get a list of all the locales supported by your
system.  Hopefully, some of them will clearly be UTF-8 locales.
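
Since the spelling of the suffix varies, one way to pick out the UTF-8
locales is to filter the list case-insensitively.  A small sketch (the
function name is made up for illustration):

```shell
# Keep only locale names that mention UTF-8, whether spelled
# .UTF-8, .utf8, or similar.  Reads names on stdin, one per line.
utf8_locales() {
    grep -i -E 'utf-?8'
}

# Typical use:  locale -a | utf8_locales
```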

2. Your terminal program (if you're using one) needs to support
multi-byte Unicode characters.  If it does not, you're hosed.  Recent
versions of xterm generally do, though I can't say whether the
version that Mac uses does.
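
A quick way to check the terminal is to send it a few multi-byte
characters and look at what comes out.  A sketch, with a made-up
function name; the octal escapes are the UTF-8 bytes for a pair of
curly quotes, an ellipsis, and one Korean syllable:

```shell
utf8_sample() {
    # U+201C, the word "quoted", U+201D, U+2026, U+D55C, newline
    printf '\342\200\234quoted\342\200\235 \342\200\246 \355\225\234\n'
}
utf8_sample
```

If the terminal is decoding UTF-8, each three-byte sequence shows up
as a single character.  A run of accented Latin letters instead
usually means the bytes are being treated as a single-byte encoding.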

3. Your font needs to be a Unicode font, AND it needs to have the
glyphs you're trying to display.  This is a sticking point for a lot
of people.  You MUST choose an iso-10646 registry font, but even if
you do, it most likely will only have iso-8859-* characters in it.  It
may or may not have the curly quotes and ellipsis that Kyle is rather
fond of.  These days the X window system comes with at least 2
suitable fonts; but again, that doesn't mean that Macs have them.  You
can try setting one of the following in your .Xdefaults file:

  XTerm*font: -misc-fixed-medium-r-semicondensed-*-13-*-*-*-*-*-iso10646-*
  XTerm*font: -Misc-Fixed-Medium-R-Normal--18-120-100-100-C-90-ISO10646-1

The second one is more complete, but it probably won't affect you in
any noticeable way.  The only other practical difference between the
two fonts is size.  The second is larger, the equivalent of running
xterm with -fn 9x18 on most Unix systems, if that means anything to
you.  The first one corresponds to the "fixed" font (the default
xterm font).
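
To see which ISO10646 fonts your X server actually offers, you can
filter the output of xlsfonts.  A sketch (the function name is
invented for illustration; it only looks at the registry field near
the end of each font name):

```shell
# Keep only font names whose registry field is ISO10646.
# Reads font names on stdin, one per line.
iso10646_fonts() {
    grep -i -e '-iso10646-[^-]*$'
}

# Typical use, from inside a running X session:
#   xlsfonts | iso10646_fonts
```

To try a font before committing it to .Xdefaults, xterm -fn followed
by the quoted font name starts a terminal using it.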

4. Your application needs to support Unicode.  Mutt does, so you're
all set there, though you may need to make sure it's compiled with
various language-related options.  On my system, this seems to happen
automatically.
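
If you want to check, mutt -v prints the compile-time options.  The
sketch below pulls out the ones that typically matter for character
sets.  The function name is made up, and the flag names are an
assumption based on common builds; they may differ between versions:

```shell
# Print the character-set-related options from "mutt -v" output,
# one per line.  Reads the output on stdin.
mutt_i18n_flags() {
    tr -s ' ' '\n' |
        grep -E '^[+-](HAVE_ICONV|HAVE_WC_FUNCS|HAVE_LANGINFO_CODESET|ENABLE_NLS)$'
}

# Typical use:  mutt -v | mutt_i18n_flags
```

A leading + means the option was compiled in, a leading - that it was
not.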

5.  You need some kind of encoding conversion program, such as iconv.
This also should not be an issue.
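
To confirm iconv is installed and working, you can feed it a byte it
has to convert.  A sketch (the function name is made up; octal 351 is
the ISO-8859-1 byte for e-acute):

```shell
# Convert ISO-8859-1 text on stdin to UTF-8 on stdout.
latin1_to_utf8() {
    iconv -f ISO-8859-1 -t UTF-8
}

# "caf" plus e-acute as a single Latin-1 byte; in UTF-8 the last
# character becomes the two bytes 303 251 (octal).
printf 'caf\351\n' | latin1_to_utf8
```

In a working UTF-8 terminal this should display the word with the
accent intact.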

I don't think I missed anything.  If you followed all that, you should
be able to run mutt in a UTF-8 locale and display all (well, most)
UTF-8 characters correctly, for every language which is supported by
your OS.

-- 
Derek D. Martin    http://www.pizzashack.org/   GPG Key ID: 0xDFBEAD02
-=-=-=-=-
This message is posted from an invalid address.  Replying to it will result in
undeliverable mail.  Sorry for the inconvenience.  Thank the spammers.
