
Re: e-mail encoding/formatting (was Re: Split-screen mode in mutt?)



-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Monday, May  1 at 01:45 PM, quoth Derek Martin:
> On Mon, May 01, 2006 at 01:06:38PM -0400, Kyle Wheeler wrote:
>> Heh, much to the detriment? Meh. 
>
> It's nice to speak English, isn't it?  You can just disregard everyone 
> else's encoding difficulties, blissfully ignorant of the hassle that 
> you cause...

And you know what else, using complicated words like "blissfully" and 
"ignorant" can also be confusing to people who don't speak English 
very well (heck, using English at all is confusing to people who don't  
speak English). Even so, both of us are likely to continue to use 
any and all English words in our vocabularies that seem appropriate, 
regardless of what hassle that may cause for people who do not speak 
English and therefore cannot understand us.

Now, maybe you can make the argument that using UTF-8 encoding causes 
additional, unnecessary hassles, when I could instead be using an 
older standard. But of course, the same would be true if I were sending 
email from an older IBM machine that could only speak EBCDIC. Encoding 
"hassles" have been a problem since the dawn of the internet (and even 
a little before that). Unicode is the *solution* to the hassles, and 
the sooner people can be convinced to start using it, the better it is 
for all concerned.

>> I'm encouraging those who use good mail clients (like mutt) to set 
>> them up in a UTF-8-using way! :) 
>
> A very large percentage of computer users use operating systems that 
> are still not 100% Unicode functional, making their switching to any 
> Unicode locale problematical at best, or very likely entirely 
> impossible.  A great many of those have little or no control over what 
> they are using.  So you're making trouble for potentially a great many 
> people.

The text I write is not unintelligible. The complaints have been along 
the lines of "why am I getting ??? in my emails where the quotes 
should be?"; forgive me, but this hardly seems like a huge problem 
for people trying to read it who are unable to use software that 
supports Unicode characters. At best, it's a subtle prod for that user  
to learn to configure their system in a more modern way, and at worst 
it gives the impression that I have a question when I do not.
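
As an illustration of where those question marks come from, here is a 
minimal Python sketch. It assumes the reader's software converts text 
to ASCII and substitutes "?" for anything it cannot represent; Python's 
errors="replace" merely stands in for that behavior, and real mail 
clients may differ.

```python
# Curly quotes written as escapes so this snippet itself stays 7-bit clean.
text = "\u201cquoted\u201d"

# Encode to ASCII, substituting "?" for every character ASCII lacks.
shown = text.encode("ascii", errors="replace").decode("ascii")
print(shown)
```

The words survive intact; only the two quote characters are lost.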

Let's keep in mind that for mutt users (which, given the name of this 
list, should be most people who actually read the email) who are 
forced to use computer systems that do not support modern character 
sets, this is a matter of simply adding "//TRANSLIT" to the end of 
their $charset setting.
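
Roughly speaking, transliteration swaps each unsupported character for 
a close look-alike instead of a question mark. The Python sketch below 
only approximates the idea: the TRANSLIT table and the to_ascii helper 
are hypothetical names invented for this example, and the table is a 
hand-picked handful of entries, not what iconv actually ships.

```python
# Hypothetical, hand-picked table; real transliteration tables are far larger.
TRANSLIT = str.maketrans({
    "\u2018": "'",    # left single quote
    "\u2019": "'",    # right single quote
    "\u201c": '"',    # left double quote
    "\u201d": '"',    # right double quote
    "\u2013": "-",    # en dash
    "\u2026": "...",  # ellipsis
})

def to_ascii(text: str) -> str:
    """Map known characters to ASCII look-alikes, then replace the rest with '?'."""
    approximated = text.translate(TRANSLIT)
    return approximated.encode("ascii", errors="replace").decode("ascii")

print(to_ascii("\u201cquoted\u201d"))
```

Characters missing from the table still fall back to "?", so the 
result is always displayable on an ASCII-only terminal.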

>> And on top of that, it's good typography. Quotes have a history of 
>> "correct" usage starting LONG before someone decided to cut corners 
>> and only have straight quotes available. 
>
> Who cares?  Everyone recognizes straight quotes for what they are, and 
> they cause no one problems that I have ever heard about.  Curly quotes 
> do.  Languages evolve and change, and computers have made the straight 
> quote ubiquitous over the last 40 years.  It's like you're Lady Jane, 
> demanding that the value of a quid return to some historic 
> pre-inflation value...

No, I'm not insisting on the economically impossible, wishing for 
something that could never really happen. I am insisting on using an 
existing feature of existing technology, nothing more.

> The Chinese used the same complicated characters for thousands of 
> years, and then scholars decided to simplify them to make things 
> easier for the masses.  Straight quotes are an example of the same 
> impetus in action, albeit with a vastly smaller impact.

The reason the Chinese simplified their character set was that the 
masses could not understand it (you had to have years of training to 
begin to recognize them all). The same is not true of curly quotes. 
Curly quotes have remained in use in virtually every 
non-computer-display form. Get a book, a newspaper, a magazine; they 
all use curly quotes. Straight quotes are the result of a 
technological limitation for a particular method of communication; a 
limitation that has been (for the most part) removed by widespread 
support for Unicode. The simplification of the Chinese language was a 
widespread, calculated revision that was used in nearly all forms of 
written communication in order to make the language usable by more 
than just the scholars who had dedicated their lives to the study of 
the language (I say nearly all, because the original set was and is 
still used by some scholars).

>> Technology has finally gotten around to providing some of the more 
>> basic features of the Gutenberg printing press. I think this is a 
>> great thing.
>
> I think you mean obscure and obsolete...

Unicode is hardly obscure or obsolete.

>>> Even better, some Windows applications use this encoding and 
>>> incorrectly label the resulting data as iso-8859-1.  Extremely 
>>> annoying.
>> 
>> Eh, add a "charset-hook iso-8859-1 windows-1252" to your muttrc and 
>> breathe deeply of the peace of mind.
>
> Which only works if your system actually knows about that character 
> set.  Lots and lots of Unix systems do not.  IIRC even older Windows 
> systems don't, and DOS definitely doesn't.  All of these are still 
> in surprisingly wide use, believe it or not.

How many DOS systems are used for reading email? How many DOS systems 
does mutt run on?

For machines that will run mutt, this is precisely a problem that 
libiconv was created to solve.
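
To make the mislabeling problem concrete, here is a small Python 
sketch. It uses Python's built-in codecs rather than libiconv, and the 
sample bytes are an assumption about what a Windows application might 
emit for curly-quoted text.

```python
# Bytes a Windows application might emit for curly-quoted text.
raw = b"\x93quoted\x94"

# Trusting an iso-8859-1 label yields invisible C1 control characters.
as_latin1 = raw.decode("iso-8859-1")

# Treating the same bytes as windows-1252 recovers the curly quotes.
as_cp1252 = raw.decode("windows-1252")

print(repr(as_latin1))
print(as_cp1252)
```

Both decodings succeed without error, which is why the mislabeling goes 
unnoticed until the quotes fail to show up on screen.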

> Another common Microsoft (or just any webmail) brain-death is 
> mislabeling virtually every encoding as us-ascii.  A lot of 
> non-English speaking people already have to use Mutt's charset hooks 
> just to get their own language displayed properly, so this trick 
> won't work for them...

Sure it can, if only in an expanded form. But like anyone giving good 
advice, I was giving you the simplest advice appropriate to what I 
understood your situation to be. If I believed you were someone who 
had trouble getting his own language displayed properly without the 
use of charset-hooks, I would have suggested something different. For 
example, compound hooks (hooks that create hooks) can be used to get 
the charset to display properly as long as there is something else in 
the headers (such as a User-Agent or a From header) that can be used 
to distinguish senders who do not encode their email correctly.

> It's been brought up twice in the last month; that should make it 
> clear that you are, in fact, causing difficulties for people.  For 
> every person who speaks up, how many do not?

This is a good argument.

An even better argument to make would be to say that mutt-users is, to 
a large extent, a mailing list for people who are having trouble with 
their email. As such, emails to this list should be in a form that has 
the greatest probability of working (or at least, being readable) by 
people with broken email readers. While a somewhat less-broken setup 
may merely replace the curly quotes with ???, a very broken setup may 
refuse to display the email at all.

And I agree, which is why this, and all future emails from me to this 
list, will be in 7-bit us-ascii.

I still think that, given the opportunity, people should take the time 
to support and use Unicode, and that using it in general will prompt 
more widespread adoption of that solution.

~Kyle

P.S. If you have problems with modern encodings, why do you use 
PGP/MIME? Pine users, for example, cannot use that form of signature 
without a great amount of trouble, nor can many people who use much 
older mailers that do not support MIME encoding of any sort.
- -- 
Scientists have shown that the moon is moving away at a tiny, although 
measurable distance from earth every year. If you do the math, you can 
calculate that 65 million years ago, the moon was orbiting at a 
distance of about 35 feet from the earth's surface. This would explain 
the death of the dinosaurs . . . the tallest ones, anyway.
                                                            -- Unknown
-----BEGIN PGP SIGNATURE-----
Comment: Thank you for using encryption!

iD8DBQFEVowWBkIOoMqOI14RAiN5AJ0ZR2dsdyTqMcDAGNaaRugFgY7kFACfe8Pv
/8HaDnX7BGODIF5DQq74H0Y=
=pCWK
-----END PGP SIGNATURE-----