Re: Subject üî
On Monday, September 3, 2007 at 15:22:31 -0400, dv1445@xxxxxxxxx wrote:
> my muttrc, which wants to see "set charset=UTF-8" instead of
> "=en_US.UTF-8", while the latter is what $LANG wants to see.
Exactly. When I said they have to agree, it was about the charset
only. However, you don't want to set $charset in muttrc at all: The
default value is automatically derived from the current locale. This
automatic link was absent or broken in previous versions, but since
around Mac OS X 10.3 Panther, it works well.
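To illustrate that derivation, here is a rough Python sketch. It is not
Mutt's actual code, and the helper name charset_from_locale is made up
for this mail. It only shows that the charset is the part of the locale
name after the dot:

```python
def charset_from_locale(locale_name):
    """Return the codeset part of a POSIX locale name, lowercased, or None.

    Hypothetical helper for illustration only. A real program would
    normally ask the C library (nl_langinfo(CODESET)) instead of parsing
    the string itself.
    """
    # Drop an optional modifier such as "@euro".
    base = locale_name.split("@", 1)[0]
    if "." not in base:
        return None  # e.g. "C" or "en_US": no explicit codeset given
    return base.split(".", 1)[1].lower()


print(charset_from_locale("en_US.UTF-8"))
```

So LANG=en_US.UTF-8 carries the charset "utf-8" inside it, which is why
$LANG wants the long form while $charset only needs the short one.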
> in the UTF-8 mode, I see that my subject line in the header is encoded
> in ISO-8859-1.
Feature: Upon sending a text, Mutt picks the first charset declared in
the $send_charset list that can represent every character of the text,
that is, the minimal necessary and sufficient one.
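The selection rule can be sketched in Python as follows. This is only
my approximation of the behaviour, not Mutt's source. The function name
pick_send_charset is invented, and Python's codecs stand in for the
conversion Mutt really uses:

```python
# Mutt's documented default value for $send_charset.
DEFAULT_SEND_CHARSET = "us-ascii:iso-8859-1:utf-8"


def pick_send_charset(text, send_charset=DEFAULT_SEND_CHARSET):
    """Return the first charset of the colon-separated list that can
    encode every character of text, or None if none of them can.

    Hypothetical helper: an approximation of the rule, for illustration.
    """
    for name in send_charset.split(":"):
        try:
            text.encode(name)
        except UnicodeEncodeError:
            continue  # some character does not fit, try the next one
        return name
    return None


print(pick_send_charset("uia"))   # pure ASCII text
print(pick_send_charset("üîå"))   # needs Latin-1, but nothing more
```

With the default list, a subject like "üîå" fits entirely in
iso-8859-1, so that is what gets declared, even when your terminal and
$charset are UTF-8.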
> I have to do something to xterm to force it to work under UTF-8
Right. Try to start it via the uxterm command.
> I should be doing something to my vimrc to make vim write the file in
> the right encoding.
That may very well be the case: Mutt's $editor has to dumbly read
and write in the locale's charset, period. Smart charset auto-sensing
(Vim's 'fileencodings' option) is useless (to Mutt), and can be harmful
(when it gives the appearance of rightness to broken characters).
However, that doesn't seem to be your problem of today.
> Ångström. SØren. Cristóbal. Æneid. Straße. Œdipus.
All seems well in body. Subject is "[]üîå[]", meaning "uia" flanked
by 2 U+FFFD replacement characters (in my font they look like empty
squares). That seems to be what you described sending. Next question is:
What added those U+FFFD chars??
I dropped the replacements, and just kept the "uia" in the subject.
Please reply with your UTF-8 setup, so we'll see if replacement chars
are reintroduced when you don't type special chars yourself.
> Is it better to do Latin 1 or UTF-8?
Latin-1 is extremely common, and contains the characters needed by
most western languages: Spanish, German, French, Portuguese, and some
such. Well... It lacks the Euro symbol €, and the oe ligatures œ Œ.
UTF-8 is a form of Unicode, containing all characters of the world.
Including the above, plus Polish, Russian, Chinese ideograms, Tamil,
and all such. It is however a little less portable, a little more
difficult to set up (nothing hairy), and on Macs a little less stable.
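If you want to check for yourself which characters fit where, here is a
small Python sketch. The helper name fits is made up, and Python's
codec tables stand in for the real charsets:

```python
def fits(text, charset):
    """True if every character of text exists in charset.

    Hypothetical helper for illustration.
    """
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


samples = ["Ångström", "Straße", "Cristóbal", "€", "Œdipus"]
for word in samples:
    # Show, for each word, whether Latin-1, Latin-9 and UTF-8 can hold it.
    print(word,
          fits(word, "iso-8859-1"),
          fits(word, "iso-8859-15"),
          fits(word, "utf-8"))
```

The Euro symbol and the oe ligature fail in iso-8859-1 but pass in
iso-8859-15 (Latin-9) and in UTF-8, while the other samples pass
everywhere.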
My advice would be to make the (little) effort to set up UTF-8 for
yourself. And for sending, trust Mutt's optimal choice. As for setting
the right $send_charset list, it depends on the languages you write in.
The default $send_charset is a good start; see my infosig for
another example.
On Monday, September 3, 2007 at 16:10:16 -0400, dv1445@xxxxxxxxx wrote:
> Wow, I can see myself that this looks terrible. So does the subject
> line
The parent mail looks good (outside of the U+FFFD pair). You see it
broken because you broke your Latin-1 setup: Do not set $charset.
> when I set everything to UTF-8, the little arrows drawn in thread view
> are all wrong.
Those semigraphic chars should work provided that $charset=utf-8
exactly (that's the default value derived from LANG=en_US.UTF-8), that
TERM=nsterm-16color, and that you have unchecked Terminal.app's "Wide
glyphs for Japanese/Chinese/etc." setting. If they are still wrong,
please describe what they look like.
Bye! Alain.
--
Mutt muttrc tip to send mails in best adapted first necessary and sufficient
charset (version for East Europe Latin-2/CP-852/CP-1250 terminal users):
set send_charset="us-ascii:iso-8859-1:iso-8859-15:windows-1252:iso-8859-2:windows-1250:utf-8"