Re: e-mail encoding/formatting (was Re: Split-screen mode in mutt?)
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On Monday, May 1 at 01:45 PM, quoth Derek Martin:
> On Mon, May 01, 2006 at 01:06:38PM -0400, Kyle Wheeler wrote:
>> Heh, much to the detriment? Meh.
>
> It's nice to speak English, isn't it? You can just disregard everyone
> else's encoding difficulties, blissfully ignorant of the hassle that
> you cause...
And you know what else, using complicated words like "blissfully" and
"ignorant" can also be confusing to people who don't speak English
very well (heck, using English at all is confusing to people who don't
speak English). Even so, both of us are likely to continue to use
any and all English words in our vocabularies that seem appropriate,
regardless of what hassle that may cause for people who do not speak
English and therefore cannot understand us.
Now, maybe you can make the argument that using UTF-8 encoding causes
additional, unnecessary hassles, when I could instead be using an
older standard. But of course, the same would be true if I were sending
email from an older IBM machine that could only speak EBCDIC. Encoding
"hassles" have been a problem since the dawn of the internet (and even
a little before that). Unicode is the *solution* to the hassles, and
the sooner people can be convinced to start using it, the better it is
for all concerned.
>> I'm encouraging those who use good mail clients (like mutt) to set
>> them up in a UTF-8-using way! :)
>
> A very large percentage of computer users use operating systems that
> are still not 100% Unicode functional, making their switching to any
> Unicode locale problematical at best, or very likely entirely
> impossible. A great many of those have little or no control over what
> they are using. So you're making trouble for potentially a great many
> people.
The text I write is not unintelligible. The complaints have been along
the lines of "why am I getting ??? in my emails where the quotes
should be?"; forgive me, but this hardly seems like a huge problem
for people trying to read it who are unable to use software that
supports Unicode characters. At best, it's a subtle prod for that user
to learn to configure their system in a more modern way, and at worst
it gives the impression that I have a question when I do not.
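
As a rough illustration of where the question marks come from, here is a
minimal Python sketch. It assumes the reader's software understands only
7-bit ASCII and shows one "?" for every byte it cannot interpret; the
function name show_on_ascii_terminal and the sample string are made up
for this example, and real terminals and mailers may behave differently.

```python
def show_on_ascii_terminal(text: str) -> str:
    """Simulate a display that understands only 7-bit ASCII.

    The text is sent as UTF-8; every byte outside ASCII is shown as "?".
    """
    raw = text.encode("utf-8")
    return "".join(chr(b) if b < 0x80 else "?" for b in raw)


message = "\u201cquoted\u201d"  # curly double quotes around a word
print(show_on_ascii_terminal(message))  # prints ???quoted???
```

Each curly quote occupies three bytes in UTF-8, which would account for
three question marks per quote under this assumption. The surrounding
words stay readable.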
Let's keep in mind that for mutt users (which, given the name of this
list, should be most people who actually read the email) who are
forced to use computer systems that do not support modern character
sets, this is a matter of simply adding "//TRANSLIT" to the end of
their $charset setting.
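
The idea behind transliteration can be sketched in a few lines of
Python. This is not how iconv implements "//TRANSLIT"; the table below is
a hypothetical, deliberately tiny stand-in, and to_ascii_translit is a
name invented for this example. It only shows the principle: replace
characters the target charset lacks with a close approximation, and fall
back to "?" for anything not covered.

```python
# Hypothetical table: a tiny subset of what a real transliterator covers.
TRANSLIT = str.maketrans({
    "\u2018": "'",    # left single quote
    "\u2019": "'",    # right single quote / apostrophe
    "\u201c": '"',    # left double quote
    "\u201d": '"',    # right double quote
    "\u2026": "...",  # horizontal ellipsis
})


def to_ascii_translit(text: str) -> str:
    """Approximate known typographic characters, then mark leftovers with '?'."""
    approximated = text.translate(TRANSLIT)
    return approximated.encode("ascii", errors="replace").decode("ascii")


print(to_ascii_translit("\u201cIt\u2019s fine\u201d"))  # prints "It's fine"
```

With a table like this, the curly quotes arrive as straight quotes
rather than as question marks.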
>> And on top of that, it's good typography. Quotes have a history of
>> "correct" usage starting LONG before someone decided to cut corners
>> and only have straight quotes available.
>
> Who cares? Everyone recognizes straight quotes for what they are, and
> they cause no one problems that I have ever heard about. Curly quotes
> do. Languages evolve and change, and computers have made the straight
> quote ubiquitous over the last 40 years. It's like you're Lady Jane,
> demanding that the value of a quid return to some historic
> pre-inflation value...
No, I'm not insisting on the economically impossible, wishing for
something that could never really happen. I am insisting on using an
existing feature of existing technology, nothing more.
> The Chinese used the same complicated characters for thousands of
> years, and then scholars decided to simplify them to make things
> easier for the masses. Straight quotes are an example of the same
> impetus in action, albeit with a vastly smaller impact.
The reason the Chinese simplified their character set was that the
masses could not understand it (you had to have years of training to
begin to recognize them all). The same is not true of curly quotes.
Curly quotes have remained in use in virtually every
non-computer-display form. Get a book, a newspaper, a magazine; they
all use curly quotes. Straight quotes are the result of a
technological limitation for a particular method of communication; a
limitation that has been (for the most part) removed by widespread
support for Unicode. The simplification of the Chinese language was a
widespread, calculated revision that was used in nearly all forms of
written communication in order to make the language usable by more
than just the scholars who had dedicated their lives to the study of
the language (I say nearly all, because the original set was and is
still used by some scholars).
>> Technology has finally gotten around to providing some of the more
>> basic features of the Gutenberg printing press. I think this is a
>> great thing.
>
> I think you mean obscure and obsolete...
Unicode is hardly obscure or obsolete.
>>> Even better, some Windows applications use this encoding and
>>> incorrectly label the resulting data as iso-8859-1. Extremely
>>> annoying.
>>
>> Eh, add a "charset-hook iso-8859-1 windows-1252" to your muttrc and
>> breathe deeply of the peace of mind.
>
> Which only works if your system actually knows about that character
> set. Lots and lots of Unix systems do not. IIRC even older Windows
> systems don't, and DOS definitely doesn't. All of these are still
> in surprisingly wide use, believe it or not.
How many DOS systems are used for reading email? How many DOS systems
does mutt run on?
For machines that will run mutt, this is precisely a problem that
libiconv was created to solve.
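
To make the mislabeling problem concrete, here is a small Python sketch
of what the charset-hook quoted above works around. Python uses its own
codec tables rather than libiconv, so this only illustrates the
conversion, not mutt's code path; the sample bytes are made up.

```python
# Bytes as a windows-1252 sender would emit curly double quotes.
raw = b"\x93quoted\x94"

# Decoding with the (wrong) declared label yields C1 control characters.
as_labeled = raw.decode("iso-8859-1")

# Decoding with the charset the sender actually used yields curly quotes.
as_intended = raw.decode("windows-1252")

print(repr(as_labeled))  # prints '\x93quoted\x94'
print(as_intended)
```

The two charsets agree on the printable characters iso-8859-1 defines;
they differ in the 0x80-0x9F range, where windows-1252 puts the curly
quotes. That is why treating the label as an alias is generally safe for
text.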
> Another common Microsoft (or just any webmail) brain-death is
> mislabeling virtually every encoding as us-ascii. A lot of
> non-English speaking people already have to use Mutt's charset hooks
> just to get their own language displayed properly, so this trick
> won't work for them...
Sure it can, if only in an expanded form. But like anyone giving good
advice, I was giving you the simplest advice appropriate to what I
understood your situation to be. If I believed you were someone who
had trouble getting his own language displayed properly without the
use of charset-hooks, I would have suggested something different. For
example, compound hooks (hooks that create hooks) can be used to get
the charset to display properly as long as there is something else in
the headers (such as a User-Agent or a From header) that can be used
to distinguish senders who do not encode their email correctly.
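
The logic of such a header-based override can be sketched in Python as
follows. This is not mutt's hook mechanism, only an illustration of the
decision it would encode. The rule table, the "ExampleWebmail" agent
string, the addresses, and the function name effective_charset are all
hypothetical.

```python
from email import message_from_string

# Hypothetical rules: (header name, substring to look for, charset to assume).
RULES = [
    ("User-Agent", "ExampleWebmail", "windows-1252"),
    ("From", "@example.org", "koi8-r"),
]


def effective_charset(msg) -> str:
    """Return the charset to decode with, overriding the declared one
    when a header marks the sender as known to mislabel mail."""
    for header, needle, charset in RULES:
        if needle in msg.get(header, ""):
            return charset
    # No rule matched: trust the declared charset, defaulting to us-ascii.
    return msg.get_content_charset() or "us-ascii"


msg = message_from_string(
    "From: someone@example.com\n"
    "User-Agent: ExampleWebmail 1.0\n"
    "Content-Type: text/plain; charset=us-ascii\n"
    "\n"
    "body\n"
)
print(effective_charset(msg))  # prints windows-1252
```

Messages from senders that match no rule keep their declared charset, so
correctly labeled mail is left alone.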
> It's been brought up twice in the last month; that should make it
> clear that you are, in fact, causing difficulties for people. For
> every person who speaks up, how many do not?
This is a good argument.
An even better argument to make would be to say that mutt-users is, to
a large extent, a mailing list for people who are having trouble with
their email. As such, emails to this list should be in a form that has
the greatest probability of working (or at least, being readable) by
people with broken email readers. While a somewhat less-broken setup
may merely replace the curly quotes with ???, a very broken setup may
refuse to display the email at all.
And I agree, which is why this, and all future emails from me to this
list, will be in 7-bit us-ascii.
I still think that, given the opportunity, people should take the time
to support and use Unicode, and that using it in general will prompt
more widespread adoption of that solution.
~Kyle
P.S. If you have problems with modern encodings, why do you use
PGP/MIME? Pine users, for example, cannot use that form of signature
without a great amount of trouble, nor can many people who use much
older mailers that do not support MIME encoding of any sort.
- --
Scientists have shown that the moon is moving away at a tiny, although
measurable distance from earth every year. If you do the math, you can
calculate that 65 million years ago, the moon was orbiting at a
distance of about 35 feet from the earth's surface. This would explain
the death of the dinosaurs . . . the tallest ones, anyway.
-- Unknown
-----BEGIN PGP SIGNATURE-----
Comment: Thank you for using encryption!
iD8DBQFEVowWBkIOoMqOI14RAiN5AJ0ZR2dsdyTqMcDAGNaaRugFgY7kFACfe8Pv
/8HaDnX7BGODIF5DQq74H0Y=
=pCWK
-----END PGP SIGNATURE-----