Re: Reading UTF-8 Mail (was Re: e-mail encoding/formatting)



On Monday, May  1 at 02:06 PM, quoth Richard Cobbe:
Ok, I'll bite.  How does one configure mutt to display UTF-8 mails
correctly?  After a great deal of effort, I've not succeeded.

Mac OS X 10.4.6.

This is the system that I most commonly use.

Typically, mutt will "just work" if you configure your environment appropriately. One easy way to do that is to use the "uxterm" script, which launches an xterm with all the right environment variables set up for you. Within that xterm, run mutt, and you should be in business.

There is one possible "gotcha" here, and that is your curses library. I *think* the standard ncurses library that comes with OS X 10.4.6 supports wide characters (a.k.a. Unicode characters), but if you've installed your own (e.g. with fink), or if you're using slang, it may not. With fink, for example, you need to make sure you install ncursesw (note the "w" on the end).
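
For what it's worth, here is a rough sketch of one way to see which curses library a mutt binary is linked against. It assumes "otool -L" on Mac OS X and "ldd" elsewhere; the helper name uses_ncursesw is made up for illustration. A library can support wide characters without having "ncursesw" in its name, so a miss here is only a hint, not proof.

```shell
#!/bin/sh
# Hypothetical helper: reads a shared-library listing on stdin and succeeds
# if a library named libncursesw (wide-character ncurses) appears in it.
uses_ncursesw() {
    grep -q 'libncursesw'
}

# Locate mutt; do nothing if it is not installed.
mutt_path=$(command -v mutt 2>/dev/null || true)
if [ -n "$mutt_path" ]; then
    # otool -L lists linked libraries on Mac OS X; ldd does so on most
    # other systems.
    if command -v otool >/dev/null 2>&1; then
        listing=$(otool -L "$mutt_path" 2>/dev/null || true)
    else
        listing=$(ldd "$mutt_path" 2>/dev/null || true)
    fi
    if printf '%s\n' "$listing" | uses_ncursesw; then
        echo "mutt is linked against ncursesw"
    else
        echo "no libncursesw in the listing; check the curses line by hand"
    fi
fi
```

If the listing shows slang or a plain libncurses from fink, that would be the first thing I'd look at.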

[vetinari:~]$ mutt -v
Mutt 1.4.2.1i (2004-02-12)

Hmm, I use the mutt out of CVS... I believe they both auto-detect the charset, but I've never tested it. If it doesn't set your $charset variable properly even though your LC_CTYPE variable is set correctly, you can add this to your muttrc:

   set charset="utf-8"

You can test the $charset variable by executing:

   :set ?charset

within mutt.

This has failed in both xterm (with LC_CTYPE=en_US.UTF-8) *and* in
Apple's Terminal.app, which is configured for UTF-8 text.

What did you try?

Once you have a terminal that is configured to support UTF-8 (e.g. Apple's Terminal with the proper settings, or uxterm), you need to let the programs that run within it know that the expanded character set is available. The way this is done is with the LC_CTYPE variable, like you have there (read: it needs to be set for both Apple's Terminal.app and for uxterm). The exact contents of that variable depend on the system, but based on the output of "locale -a" on my 10.4.6 system, what you have there looks right.
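
As a sketch of the check described above: the snippet below picks the first UTF-8 locale out of the "locale -a" output and asks "locale charmap" what codeset results when LC_CTYPE is set to it. The helper name first_utf8_locale is made up for illustration, and the name pattern (".UTF-8" or ".utf8") is an assumption about how your system spells its locales.

```shell
#!/bin/sh
# Hypothetical helper: reads locale names (one per line, as printed by
# "locale -a") on stdin and prints the first one whose name ends in a
# UTF-8 codeset suffix.
first_utf8_locale() {
    grep -i -E '\.utf-?8$' | head -n 1
}

candidate=$(locale -a 2>/dev/null | first_utf8_locale || true)
if [ -n "$candidate" ]; then
    echo "candidate locale: $candidate"
    # LC_ALL overrides LC_CTYPE, so clear it for this one command only.
    codeset=$(LC_ALL= LC_CTYPE="$candidate" locale charmap 2>/dev/null || true)
    echo "codeset reported: $codeset"
fi
```

If the codeset comes back as UTF-8, then putting something like "export LC_CTYPE=en_US.UTF-8" in your shell startup file (using whatever name your system actually lists) should be enough for programs started from that shell.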

Problem 1: the folder index is completely broken. The threading characters fail to display correctly, and update is broken. Scrolling backwards through a large folder results in a completely corrupted display; see <http://www.ccs.neu.edu/home/cobbe/misc/mutt-utf-8/broken-index.png> for a screenshot. (This and all other screenshots display the output in Terminal.app; xterm's output is similar.) Setting ascii_chars to yes helps, but I shouldn't have to do that -- the whole point of this exercise is to display extended characters correctly!

That looks like a problem caused by a terminal library (e.g. ncurses) that doesn't support wide characters.

This isn't just an xterm or terminal problem. When I construct a small text file containing three copies of a box drawing character (Unicode 0x251C), cat displays it fine in both Terminal.app and a unicode xterm.

Yup, that's consistent with an ncurses problem.
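
For anyone who wants to repeat that test, a minimal sketch follows. It writes three copies of U+251C to a scratch file and cats it. The scratch file name is arbitrary, and the bytes are given as octal escapes (U+251C is E2 94 9C in UTF-8) so the script itself stays plain ASCII.

```shell
#!/bin/sh
# Scratch file for the test (the name is arbitrary, chosen for illustration).
testfile="${TMPDIR:-/tmp}/box-test.$$"

# Three copies of U+251C (BOX DRAWINGS LIGHT VERTICAL AND RIGHT), followed
# by a newline. Each character is the UTF-8 byte sequence E2 94 9C.
printf '\342\224\234\342\224\234\342\224\234\n' > "$testfile"

# With a UTF-8 terminal this should show three box-drawing glyphs.
cat "$testfile"

# Sanity check on the file itself: 3 characters x 3 bytes, plus the newline.
bytes=$(wc -c < "$testfile" | tr -d ' ')
echo "bytes written: $bytes"

rm -f "$testfile"
```

If cat shows the glyphs but mutt's index doesn't, the terminal is doing its job and the curses library is the more likely culprit.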

Problem 2: displaying email messages with extended chars also fails, though less spectacularly. See <http://www.ccs.neu.edu/home/cobbe/misc/mutt-utf-8/broken-quotes.png> for a screenshot of the paragraph that I quoted from Kyle's message above, rendered in Terminal.app.

Yup, I'm pretty sure that's an ncurses problem.

Suggestions welcome. Using an X client as my terminal is not something I'm willing to give up, though, for a variety of unrelated reasons.

Is this an issue of not using a recent enough Mutt?

Probably not.

Or has computer technology not, in fact, caught up with hot metal typesetting?

Heh :) Don't worry, it has. I cause all my utf-8 mayhem on this list with MacOS 10.4.6.

~Kyle
--
These are the times that try men's souls. The summer soldier and the sunshine patriot will, in this crisis, shrink from the service of their country; but he that stands it now, deserves the love and thanks of man and woman.
                                                      -- Thomas Paine
