Re: problem with utf-8 encoding using mutt + vim: solved!



 On Thursday, July 28, 2005 at 9:43:25 AM -0300, Fernando Canizo wrote:

> El 27/jul/2005 a las 20:43 -0300, Alain me decía:
>> $attribution string contains a "í" i acute U+00ED coded in Latin-1
> Incredible! This was the problem.

    Cool! I'm happy it helped, and the mind game was a pleasure.


> From what i've seen in my search for a solution, this problem arises in
> several ways

    Yes: most frequently via $signature, but also $attribution, $locale,
$indent_string, $post_indent_string, $signon/$signoff, and aliases.
Quoting broken mails can trigger it too, but that's not quite the same
problem.


> i'm using this for the .signature (just in case):
>| set signature="fortune -s | iconv -t utf-8 |"

    Hum... Fortune outputs its strings unconverted, right? If you
populate its file with UTF-8 only, it makes more sense to iconv *from*
UTF-8 to the current locale charset. That way it works in any locale.

| set signature="fortune -s | iconv -f utf-8 |"
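    To see what that conversion does to the bytes, here is a small
sketch. The helper name to_charset is made up for illustration. It only
makes explicit what "iconv -f utf-8" already does by default: convert to
the charset of the current locale (taken here from "locale charmap").
The demo passes iso-8859-1 explicitly so it does not depend on your
locale.

```shell
#!/bin/sh
# Hypothetical helper: convert UTF-8 text on stdin to a target charset.
# With no argument, the target is the charset of the current locale.
to_charset() {
    iconv -f utf-8 -t "${1:-$(locale charmap)}"
}

# "í" (U+00ED) is the two bytes c3 ad in UTF-8 and the single byte ed
# in Latin-1. Dump the converted bytes in hex to check the result.
printf 'dec\303\255a\n' | to_charset iso-8859-1 | od -An -tx1
```

    Note that the conversion fails if the target charset cannot
represent a character, for example "í" in a plain C/ASCII locale.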


>> convert once for all the full muttrc to UTF-8
> converting, i want a system full utf-8 compliant ;)

    Fine. You can add "set config_charset=utf-8" at the beginning of
muttrc, so that it becomes usable in any locale.
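    A possible way to do the one-time conversion, as a sketch only: the
function names are invented, and it assumes that anything which is not
valid UTF-8 is Latin-1 (as it was for the "í" in this thread). It
checks validity by asking iconv to decode the file as UTF-8, and keeps
a backup copy before rewriting.

```shell
#!/bin/sh
# is_utf8 FILE: succeed only if FILE decodes cleanly as UTF-8.
is_utf8() {
    iconv -f utf-8 -t utf-8 "$1" >/dev/null 2>&1
}

# latin1_to_utf8 FILE: convert FILE from Latin-1 to UTF-8 in place,
# keeping the original bytes in FILE.latin1. Files that are already
# valid UTF-8 are left untouched.
# Assumption: a file that is not valid UTF-8 is Latin-1.
latin1_to_utf8() {
    if is_utf8 "$1"; then
        return 0
    fi
    cp -p "$1" "$1.latin1" &&
    iconv -f iso-8859-1 -t utf-8 "$1.latin1" > "$1"
}

# Example call (not run here):
#   latin1_to_utf8 ~/.muttrc
```

    Pure ASCII files pass the is_utf8 test, so they are skipped too.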


>>>| LC_ALL=es_AR.utf-8
> Isn't that a default for everything else?

    LC_ALL is an override. LANG is the default.

    If you use LC_ALL permanently, you lose the ability to fine-tune
individual categories, and to quickly override your standard settings
(say, for a Russian test ride) and get them back with a simple unset at
the end.

    For the vast majority, localization means LANG only. Some people
like to additionally tweak this or that LC_* category. And LC_ALL is
mostly reserved for one-off tests.
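    The precedence can be written down as a tiny shell function. This
is only a sketch of the lookup order (LC_ALL first, then the individual
LC_* category, then LANG, falling back to "C"); the name
effective_locale is made up. Empty variables count as unset.

```shell
#!/bin/sh
# effective_locale CATEGORY: print the locale name in effect for one
# category (e.g. LC_TIME), following the precedence
#   LC_ALL  >  LC_<category>  >  LANG  >  "C"
effective_locale() {
    # Indirect lookup of the variable named by $1 (POSIX sh has no ${!1}).
    eval "cat_value=\${$1:-}"
    printf '%s\n' "${LC_ALL:-${cat_value:-${LANG:-C}}}"
}

# Example calls (not run here):
#   effective_locale LC_TIME
#   LC_ALL=ru_RU.UTF-8 effective_locale LC_TIME   # one-off override
```

    With only LANG set, every category resolves to LANG. Setting LC_ALL
hides everything else until it is unset again.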


> i'm using a full compliant spanish attribution, but maybe english
> speakers just don't care ;)

    Right: an English attribution is nicer when your readership is
international. I do it in English and French with two files. The first
is ~/.mutt/id.international:

| set attribution="Hello %v,\n\n On %d, %n wrote:\n"
| set date_format="`case $OSTYPE in \
|   cygwin) \
|     echo "!%A, %B %e, %Y at %I:%M:%S %p %Z";; \
|   *) \
|     echo "!%A, %B %-d, %Y at %-I:%M:%S %p %Z";; \
| esac`"

    The second is ~/.mutt/id.francophone:

| set config_charset=iso-8859-1
| set attribution="Bonjour %v,\n\n Le %d, %n écrivait:\n"
| set date_format="`case $OSTYPE in \
|   cygwin) \
|     echo "%A %e %B %Y à %H:%M:%S %Z";; \
|   *) \
|     echo "%A %-d %B %Y à %-H:%M:%S %Z";; \
| esac`"

    These files are sourced from folder-hooks, with a French default:

| folder-hook .                 "source ~/.mutt/id.francophone"
| folder-hook mutt-users$       "source ~/.mutt/id.international"
| folder-hook mutt-users-fr$    "source ~/.mutt/id.francophone"

    And when needed I can manually override with two macros:

| macro index \eii "<enter-command>source ~/.mutt/id.international<Enter>"
| macro index \eif "<enter-command>source ~/.mutt/id.francophone<Enter>"

    For this to work in a French locale of any charset, I take $locale
automatically from the LC_TIME category:

| set locale=`echo "${LC_ALL:-${LC_TIME:-${LANG}}}"`

    Results are:

| On Thursday, July 28, 2005 at 9:43:25 AM -0300, Fernando Canizo wrote:
| Le jeudi 28 juillet 2005 à 9:43:25 -0300, Fernando Canizo écrivait:

    As you can see, I like the canonical, written-out form of dates, for
human readers. I hate abbreviations nearly as much as ISO 8601: we are
not machines. What would the Spanish variant, id.hispanophone, look
like?


> possible to play with $attribution using an extern command like you do
> in $signature?

| set attribution=`command`     # note backticks


> ^A    --Larry Wall

    Spurious byte 01 detected in signature. ;-)


Bye!    Alain.
-- 
Everything about locales on Sven Mascheck's excellent site at new
location <URL:http://www.in-ulm.de/~mascheck/locale/>. The little tester
utility is at <URL:http://www.in-ulm.de/~mascheck/locale/checklocale.c>.