
Re: apostrophes showing as ?



On Thu, Apr 06, 2006 at 05:26:37PM +0100, David Woodfall wrote:
> I've tried Aterm, Xterm and a normal VT by switching from X with
> ctrl-alt-F6. I probably need to read up on charsets I guess.

If you're not already using a UTF-8 locale, in addition to the many
fine examples, you could try:

  LANG=en_US.UTF-8 xterm -fn \
    '-misc-fixed-medium-r-semicondensed-*-13-*-*-*-*-*-iso10646-*'

(or if you're on a Debian-based system, LANG=en_US.utf8 ...)

The shell which xterm starts inherits LANG, but if your shell startup
files set it to something else you'll need to set it again there, so
that your terminal's and environment's locales match.
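
If you're not sure which locale actually ends up in effect, the usual
POSIX precedence is LC_ALL first, then LC_CTYPE, then LANG.  Here's a
rough sketch of that check; the function names effective_ctype and
is_utf8_locale are made up for illustration, and this only looks at
the variables, it doesn't verify that the locale is installed:

```python
import os


def effective_ctype(env):
    """Return the locale name that governs character encoding.

    Follows the POSIX precedence: LC_ALL overrides LC_CTYPE, which
    overrides LANG. Empty values count as unset.
    """
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:
            return value
    return "C"  # POSIX default when nothing is set


def is_utf8_locale(locale_name):
    """True if the codeset part of the locale name is UTF-8."""
    # Locale names look like language_TERRITORY.codeset@modifier
    codeset = locale_name.partition(".")[2].partition("@")[0]
    # Accept both spellings: "UTF-8" and "utf8"
    return codeset.lower().replace("-", "") == "utf8"


if __name__ == "__main__":
    name = effective_ctype(os.environ)
    print(name, "is UTF-8" if is_utf8_locale(name) else "is not UTF-8")
```

Run it once from the shell inside the new xterm and once from the
shell you launched xterm from; if the two disagree, something in your
startup files is overriding LANG.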

The chosen font is equivalent to "fixed" or "6x13", except that in
addition to all the iso-8859-1 glyphs, it contains glyphs for a large
part of Unicode (ISO 10646), though by no means every character.  If
that works for you, you can make it your default by adding this line
to your .Xdefaults file:

  XTerm*font: -misc-fixed-medium-r-semicondensed-*-13-*-*-*-*-*-iso10646-*

Note that if you normally don't use UTF-8, but you do use non-ASCII
characters in file names, Linux (and possibly other operating systems)
will generally not be able to display old filenames of files created
under the other encoding.  Personally I think this is one design
decision that Microsuck got right (sort of)...  They store the
filename on disk in a known encoding, so that if you're using a
different encoding, it's easy to convert and get the right thing
(assuming the characters exist in both encodings).
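
To illustrate the filename problem: on Linux a file name is just a
string of bytes, and nothing records which encoding produced it.  A
minimal sketch, assuming the old names were written under iso-8859-1
(the function name recode_filename and the example name are
hypothetical):

```python
def recode_filename(raw_name, old_encoding="iso-8859-1", new_encoding="utf-8"):
    """Re-encode the raw bytes of a file name from one charset to another."""
    return raw_name.decode(old_encoding).encode(new_encoding)


# "café.txt" as it would have been stored under an iso-8859-1 locale
old_name = "caf\u00e9.txt".encode("iso-8859-1")

# Those same bytes are not valid UTF-8, so a UTF-8 terminal can't show them
try:
    old_name.decode("utf-8")
    readable_as_utf8 = True
except UnicodeDecodeError:
    readable_as_utf8 = False

# After conversion the name is a valid UTF-8 byte sequence
new_name = recode_filename(old_name)
```

This only works if you know (or correctly guess) the old encoding,
which is exactly the information the filesystem doesn't keep.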

FWIW, IMO cp1252 should not be used (i.e. one should never send mail
or write web pages encoded in that encoding).  I believe it is not
universally available, and users of non-Windows systems usually need
to configure their systems/applications in strange ways (which may be
hard to even find out about) in order to display characters from that
encoding in a useful way...  Are curly quotes really that important?
I think not.
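
For what it's worth, here is roughly what happens to a cp1252 curly
apostrophe on a terminal that expects something else.  This is only a
sketch; the helper name show_as is made up, and real terminals differ
in what they print for bytes they can't decode (a "?" is common):

```python
apostrophe = "\u2019"  # RIGHT SINGLE QUOTATION MARK, the "curly" apostrophe

as_cp1252 = apostrophe.encode("cp1252")  # a single byte, 0x92
as_utf8 = apostrophe.encode("utf-8")     # three bytes, 0xE2 0x80 0x99


def show_as(raw, terminal_encoding):
    """Decode bytes the way a terminal in that encoding would,
    substituting U+FFFD for anything it cannot interpret."""
    return raw.decode(terminal_encoding, errors="replace")


# cp1252 text on a UTF-8 terminal: 0x92 is not valid UTF-8
garbled = show_as(b"don" + as_cp1252 + b"t", "utf-8")

# The same word sent as UTF-8 comes through intact
intact = show_as(b"don" + as_utf8 + b"t", "utf-8")
```

In iso-8859-1 the byte 0x92 is a control character rather than a
printable one, so a Latin-1 terminal doesn't do any better with it.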

Although, I don't think it's a very strong argument...  I think
everyone should use Unicode (preferably UTF-8), because it enables
truly universal communication (i.e. someone who has the ability can
read and write e-mail which contains French, Russian, English (even
cp1252), Japanese, and Korean all in the same e-mail).  With any other
encoding, this is basically impossible AFAIK.  However, some of the
same arguments I made against cp1252 apply to UTF-8 as well...  The
main difference is that cp1252 can still only represent Western European
languages, so it loses.  UTF-8 can essentially represent every known
language (AFAIK), so its technical superiority outweighs the other
issues IMO.
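
A small sketch of that difference, if it helps.  The helper name
can_encode and the sample strings are just for illustration:

```python
def can_encode(text, encoding):
    """Return True if every character in text exists in the encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


french = "d\u00e9j\u00e0 vu"
russian = "\u041f\u0440\u0438\u0432\u0435\u0442"  # "Privet" in Cyrillic
japanese = "\u65e5\u672c\u8a9e"                   # "Nihongo"
korean = "\ud55c\uad6d\uc5b4"                     # "Hangugeo"

mixed = " / ".join([french, russian, japanese, korean])

# cp1252 copes with the French but not with the rest
cp1252_ok = [can_encode(s, "cp1252") for s in (french, russian, japanese, korean)]

# UTF-8 can carry all of them in one message
utf8_ok = can_encode(mixed, "utf-8")
```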

HTH

-- 
Derek D. Martin    http://www.pizzashack.org/   GPG Key ID: 0xDFBEAD02
-=-=-=-=-
This message is posted from an invalid address.  Replying to it will result in
undeliverable mail.  Sorry for the inconvenience.  Thank the spammers.
