Re: Charset issue?
- To: Mutt-Users <mutt-users@xxxxxxxx>
- Subject: Re: Charset issue?
- From: Kyle Wheeler <kyle-mutt@xxxxxxxxxxxxxx>
- Date: Wed, 2 May 2007 15:27:40 -0600
- In-reply-to: <20070502210423.GA308@xxxxxxxxxxxxx>
- List-unsubscribe: <mailto:mutt-users-request@mutt.org?body=unsubscribe>
- Mail-followup-to: Mutt-Users <mutt-users@xxxxxxxx>
- References: <20070502201022.GA32476@xxxxxxxxxxxxx> <20070502204027.GB21230@xxxxxxxxxxxxx> <20070502210423.GA308@xxxxxxxxxxxxx>
- Sender: owner-mutt-users@xxxxxxxx
- User-agent: Mutt/1.5.15 (2007-04-29)
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On Thursday, May 3 at 09:04 AM, quoth Roland Hill:
>> On Thursday, May 3 at 08:10 AM, quoth Roland Hill:
>>> I thought I had found a solution in the archives and added TRANSLIT
>>> as follows:
>
>>> set charset="iso-8859-1//TRANSLIT"
>
>>> .....but this hasn't fixed the problem.
>
>>> Can anyone offer a solution or point me somewhere where I can work it out?
>
>> What is the charset of the email you're looking at? It may be
>> mislabelled (which you may be able to use charset_hook to fix).
>
> utf-8
Hmm, then it's *definitely* mislabelled. REAL utf-8 encodes the single
curly quote (U+2019) as THREE bytes: 0xE2 0x80 0x99 (or, in mutt's
octal format: \342 \200 \231). What you got there (\264, or, in hex,
0xB4) maps to the acute accent (´) in Latin1 (aka ISO-8859-1), which
looks kinda-sorta like a curly-quote in the appropriate direction
(though it's really a rather inappropriate character to use there).
The other character you mentioned, \250 (or 0xA8) maps to the umlaut
dots (¨) in Latin1, which looks ever-so-slightly like double-quote
marks (but again, totally inappropriate).
I'm not sure what other characters those might actually be referring
to, but given the visual similarity to the characters they *should* be
referring to, I think that's probably pretty close to what's
happening.
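
If you want to check the byte values for yourself, here is a minimal
sketch in Python (my choice of tool, nothing to do with mutt itself;
the variable names are just for illustration). It encodes the real
curly quote as utf-8, then reads the two stray bytes as Latin1:

```python
# U+2019 is the right single quotation mark (the "curly quote").
curly = "\u2019"
utf8_bytes = curly.encode("utf-8")

# Real utf-8 needs three bytes for it; show them in hex and in octal.
print([hex(b) for b in utf8_bytes])
print([oct(b) for b in utf8_bytes])

# The single bytes from the message, interpreted as Latin1 (ISO-8859-1).
acute = bytes([0xB4]).decode("latin-1")
diaeresis = bytes([0xA8]).decode("latin-1")
print(acute, diaeresis)

# A lone 0xB4 is a continuation byte with nothing to continue, so it
# cannot be decoded as utf-8 at all.
try:
    bytes([0xB4]).decode("utf-8")
    lone_byte_is_valid_utf8 = True
except UnicodeDecodeError:
    lone_byte_is_valid_utf8 = False
print(lone_byte_is_valid_utf8)
```

A single byte in the 0x80-0xBF range can never stand alone in valid
utf-8, which is why the message has to be mislabelled.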
The next question is: what can you do about it? Replacing them with
the correct characters may not be possible, but getting them to
display *at all* can be done, I think. A charset hook
could help... but you don't want to break ALL utf-8 mails. Perhaps
something like this (haven't tested it, and it would break other
charset-hooks you may be using):
message-hook . "unhook charset-hook"
message-hook "~f thatguy" "charset-hook utf-8 iso-8859-1"
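
To illustrate the idea behind those two lines (this is NOT how mutt
implements it, just a rough Python sketch with made-up names like
decode_body and OVERRIDES, and it matches the sender exactly where
mutt's ~f would use a regex): the declared charset is only swapped
out for the one sender, and everybody else's utf-8 is left alone.

```python
def decode_body(raw, declared, sender, overrides):
    """Decode raw bytes with the declared charset, unless this sender
    has an override registered for that charset."""
    per_sender = overrides.get(sender, {})
    charset = per_sender.get(declared.lower(), declared)
    # errors="replace" turns undecodable bytes into U+FFFD instead of
    # raising, roughly like a mail reader showing a placeholder.
    return raw.decode(charset, errors="replace")

# Hypothetical table: only "thatguy" gets utf-8 reinterpreted as Latin1.
OVERRIDES = {"thatguy": {"utf-8": "iso-8859-1"}}

mislabelled = b"don\xb4t"                   # Latin1 bytes labelled utf-8
genuine = "don\u2019t".encode("utf-8")      # honest utf-8

fixed = decode_body(mislabelled, "utf-8", "thatguy", OVERRIDES)
unfixed = decode_body(mislabelled, "utf-8", "someone", OVERRIDES)
untouched = decode_body(genuine, "utf-8", "someone", OVERRIDES)
print(fixed, unfixed, untouched)
```

The catch is the same as with the hook: if thatguy ever sends real
utf-8, the override will garble it.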
Alain may have a better solution, though.
~Kyle
- --
These are the times that try men's souls. The summer soldier and the
sunshine patriot will, in this crisis, shrink from the service of
their country; but he that stands it now, deserves the love and thanks
of man and woman.
-- Thomas Paine
-----BEGIN PGP SIGNATURE-----
Comment: Thank you for using encryption!
iD8DBQFGOQJMBkIOoMqOI14RAuDYAJ4hVwUBW4EqwJ5S4JHWB7DvGqqyzwCfQcRs
/gugO6AHUWNCdB1IG5vV+cw=
=iQHq
-----END PGP SIGNATURE-----