
Re: Smarter send_charset



On Tue, Sep 13, 2005 at 01:05:28AM -0400, Ryan King wrote:
> On Tue, Sep 06, 2005 at 08:00:19PM +0200, Lionel Elie Mamane wrote:
>> On Mon, Sep 05, 2005 at 11:09:54PM -0400, Ryan King wrote:

>>> (...) all bitpatterns from the smallest 0x00 to the grandest 0xFF
>>> are valid ISO-8859-1 (as far as I know).

>> Nope. The ranges 0-31 and 127-159 are not valid iso-8859.

> Thanks for your reply.  I see that in the RFC, but I still don't
> understand why "head /dev/urandom | iconv -f iso-8859-1" doesn't
> complain.

Reading further, and in particular
http://en.wikipedia.org/wiki/ISO_8859-1, I see that I had confused ISO
8859-1 (also called ISO/IEC 8859-1 or Latin-1), an ISO standard
co-maintained by IEC, with ISO-8859-1, an IANA standard. These are two
different things: ISO 8859-1 doesn't define the ranges 0-31 and
127-159, while ISO-8859-1 defines them as "control characters". See
http://en.wikipedia.org/wiki/ISO_8859-1#ISO-8859-1 for a table.

It seems that these same control characters are defined in Unicode,
too: http://www.unicode.org/charts/PDF/U0080.pdf and
http://www.unicode.org/charts/PDF/U0000.pdf
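
For anyone who wants to check this without reading the charts, here is
a minimal sketch in Python. It relies on Python's unicodedata module,
where "Cc" is the general category for control characters; the names
UNDEFINED_IN_ISO_STANDARD and control_code_points are made up for this
illustration.

```python
import unicodedata

# The two ranges that ISO 8859-1 (the ISO/IEC standard) leaves undefined:
# 0-31 and 127-159, both inclusive.
UNDEFINED_IN_ISO_STANDARD = list(range(0, 32)) + list(range(127, 160))


def control_code_points():
    """Return the code points in 0..255 that Unicode classifies as controls."""
    return [cp for cp in range(256) if unicodedata.category(chr(cp)) == "Cc"]


# Compare the Unicode control characters with the gaps in the ISO standard.
print(control_code_points() == UNDEFINED_IN_ISO_STANDARD)
```

If the two lists match, the gaps in ISO 8859-1 are exactly the positions
that Unicode fills with control characters.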

> Is it just making a common-sense compromise (instead of being strict for no
> profitable reason)?

It implements ISO-8859-1, not ISO 8859-1. ($DEITY, how confusing!)
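
The same effect can be reproduced outside iconv. This is only a rough
sketch, assuming that Python's "iso-8859-1" codec behaves like the IANA
charset (all 256 byte values accepted); it says nothing about how iconv
is implemented internally, and decode_latin1 is a name invented here.

```python
import os


def decode_latin1(data: bytes) -> str:
    """Decode bytes as ISO-8859-1; raises UnicodeDecodeError on invalid input."""
    return data.decode("iso-8859-1")


# Every byte value from 0x00 to 0xFF decodes to the code point of the same value.
text = decode_latin1(bytes(range(256)))
print([ord(ch) for ch in text] == list(range(256)))

# Random bytes, in the spirit of "head /dev/urandom | iconv -f iso-8859-1".
random_text = decode_latin1(os.urandom(1024))
print(len(random_text))
```

Since no byte value is rejected, random input cannot trigger a decoding
error under this charset.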

-- 
Lionel