Re: Demoroniser (was: Display Filters)
On 2006-07-04, Alain Bench <veronatif@xxxxxxx> wrote:
> On Monday, July 3, 2006 at 19:16:58 -0700, Gary Johnson wrote:
>
> > [iso8859toascii.c] converts some characters in the range 128-255 to
> > strings of one or more ASCII characters.
>
> Very similar to Demoroniser, with some pros and cons. But I'm
> afraid most limitations and drawbacks I listed are the same, as they
> are in fact inherent to the $display_filter approach.
>
> OTOS I can now guess that you got more benefit from this filter than
> the average guy, because of the specifics of your setup. You are more
> frequently in my 3rd point case, and less or never in the 1st case. More
> \200 octalisations and chances for filter action, than ?-masks. I guessed
> that from seeing:
>
>
> > User-Agent: Mutt/1.5.9i
> > Content-Type: text/plain; charset=iso-8859-1
> >> The U+20AC '?' Euro symbol
> >> the U+0192 '?' hooked letter f
> >> The U+2030 '?' per mille sign
>
> Gaargl! Mutt sent out ugly Outlook-like lying MIME charset label...
> Scary! Contrary to the very common confusion, ISO-8859-1 and CP-1252 are
> different charsets (your filter should be named cp1252toascii.c). And
> the characters I wrote do not exist in ISO-8859-1. This means something
> is broken, probably iconv, and should be fixed. What is the output of:
>
> | $ printf "\x80 \x83 \x89\n" | iconv -f windows-1252 -t us-ascii//TRANSLIT
> | EUR f o/oo
After discovering that printf on this SunOS 5.8 system does not
support the \x escape and converting to octal:
$ printf "\200 \203 \211\n" | iconv -f windows-1252 -t us-ascii//TRANSLIT
EUR f o/oo
> | $ printf "\x80 \x83 \x89\n" | iconv -f cp1252 -t us-ascii//TRANSLIT
> | EUR f o/oo
$ printf "\200 \203 \211\n" | iconv -f cp1252 -t us-ascii//TRANSLIT
EUR f o/oo
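The hex-to-octal conversion used above can be scripted instead of done by hand. This is only a minimal sketch: it assumes a printf that accepts 0x-prefixed numeric arguments, as POSIX describes, which has not been verified against the SunOS 5.8 printf. The helper name hex_to_octal_escape is made up for illustration.

```shell
# Hypothetical helper: turn hex byte values (without the \x prefix) into
# the equivalent octal escapes that any POSIX printf understands.
hex_to_octal_escape() {
    for h in "$@"; do
        # '\\' prints a literal backslash; %03o prints the value in octal.
        # Assumes printf accepts a 0x-prefixed argument for %o.
        printf '\\%03o' "0x$h"
    done
    printf '\n'
}

# The three CP-1252 test bytes from the commands above.
hex_to_octal_escape 80 83 89
```

The resulting string can be pasted into the printf format in place of the \x escapes.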
> |
> | $ mutt -v
Mutt 1.5.9i (2005-03-13)
Copyright (C) 1996-2002 Michael R. Elkins and others.
Mutt comes with ABSOLUTELY NO WARRANTY; for details type `mutt -vv'.
Mutt is free software, and you are welcome to redistribute it
under certain conditions; type `mutt -vv' for details.
System: SunOS 5.8 (sun4u) [using ncurses 5.4]
Compile options:
-DOMAIN
+DEBUG
-HOMESPOOL -USE_SETGID +USE_DOTLOCK -DL_STANDALONE
+USE_FCNTL -USE_FLOCK -USE_INODESORT
+USE_POP -USE_IMAP -USE_GSS -USE_SSL -USE_GNUTLS -USE_SASL -USE_SASL2
+HAVE_REGCOMP -USE_GNU_REGEX
+HAVE_COLOR +HAVE_START_COLOR +HAVE_TYPEAHEAD +HAVE_BKGDSET
+HAVE_CURS_SET +HAVE_META +HAVE_RESIZETERM
+CRYPT_BACKEND_CLASSIC_PGP +CRYPT_BACKEND_CLASSIC_SMIME -CRYPT_BACKEND_GPGME
+BUFFY_SIZE -EXACT_ADDRESS -SUN_ATTACHMENT
+ENABLE_NLS -LOCALES_HACK +HAVE_WC_FUNCS +HAVE_LANGINFO_CODESET
+HAVE_LANGINFO_YESEXPR
+HAVE_ICONV +ICONV_NONTRANS -HAVE_LIBIDN +HAVE_GETSID +HAVE_GETADDRINFO
-USE_HCACHE
ISPELL="/opt/TWWfsw/bin/ispell"
SENDMAIL="/usr/lib/sendmail"
MAILPATH="/var/mail"
PKGDATADIR="/home/garyjohn/src/SunOS/mutt-1.5.9i/share/mutt"
SYSCONFDIR="/home/garyjohn/src/SunOS/mutt-1.5.9i/etc"
EXECSHELL="/bin/sh"
-MIXMASTER
To contact the developers, please mail to <mutt-dev@xxxxxxxx>.
To report a bug, please use the flea(1) utility.
patch-1.5.5.1.gj.sigontop_space_fix.1
patch-1.5.5.1.gj.attach_sanitize.1
patch-1.5.5.1.gj.stuff_all_quoted.3
> And what is in Mutt the value of ":set ?charset"
charset="iso-8859-1"
> > do you know of a HOW-TO or bootstrap procedure I could follow to get
> > this working?
>
> The Mutt Wiki <URL:http://wiki.mutt.org/?MuttFaq/Charset> has nearly
> everything from base settings to advanced solutions for some corner
> problems. But not much about X, fonts, and such.
>
> I'd say that the first step should be to determine exactly what
> charset your current terminal and font display. Please describe what
> you see when you run this at the shell:
>
> | $ printf "\xC3\xBC \x9E \n"
> | ü ?
>
> - capital A with tilde, 1/4 symbol, small z with caron ==> CP-1252
> - capital A with tilde, 1/4 symbol, nothing ==> Latin-1
> - capital A with tilde, OE ligature, nothing ==> Latin-9
> - 2 line drawing chars, and a Peseta symbol ==> CP-437
> - 2 line drawing chars, and an x (multiplication sign) ==> CP-850
> - small u with diaeresis, nothing or garbage ==> UTF-8
I usually use an xterm at work, but right now I'm using PuTTY on
Windows XP to log in remotely from home. When I first tried the
above (after converting the hex escapes to octal), I saw the second
choice. I checked my PuTTY Window -> Translation setting and saw
that the character set was set to "ISO-8859-1:1998 (Latin-1, West
Europe)". I changed that to "UTF-8" and saw the last choice: a
small u with diaeresis followed by nothing.
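The fully printable cases in the list above can also be previewed from a single UTF-8 terminal by letting iconv do the decoding. This is a rough sketch: it assumes the terminal is set to UTF-8 and that the local iconv recognizes the names CP1252, CP437 and CP850 (charset names vary between iconv implementations). show_as is a hypothetical helper name. Latin-1 and Latin-9 are left out because they map byte 0x9E to a non-printing C1 control character.

```shell
# Hypothetical helper: decode the test bytes C3 BC 20 9E as if they were
# in charset $1, and re-encode them as UTF-8 for display.
show_as() {
    printf '\303\274 \236\n' | iconv -f "$1" -t UTF-8
}

# Print one labelled line per candidate charset.
for cs in CP1252 CP437 CP850; do
    printf '%-8s ' "$cs"
    show_as "$cs"
done
```

On a UTF-8 terminal the CP1252 line should show capital A with tilde, the 1/4 symbol and small z with caron, while the CP437 and CP850 lines should show two line drawing characters followed by a Peseta symbol and a multiplication sign respectively.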
I'll be in the office only briefly this week, but I'll try to run
that experiment in an xterm there and report what I see.
In the meantime, I'll take a look at the wiki.
Thanks very much for your help.
Regards,
Gary
--
Gary Johnson | Agilent Technologies
garyjohn@xxxxxxxxxxxxxxx | Wireless Division
http://www.spocom.com/users/gjohnson/mutt/ | Spokane, Washington, USA