Re: Different encodings at index and pager views
- To: mutt-users@xxxxxxxx
- Subject: Re: Different encodings at index and pager views
- From: Kyle Wheeler <kyle-mutt@xxxxxxxxxxxxxx>
- Date: Thu, 26 Feb 2009 17:50:00 -0600
- In-reply-to: <20090226220720.GA12471@carlos>
- List-post: <mailto:mutt-users@mutt.org>
- List-unsubscribe: send mail to majordomo@mutt.org, body only "unsubscribe mutt-users"
- Mail-followup-to: mutt-users@xxxxxxxx
- Openpgp: id=CA8E235E; url=http://www.memoryhole.net/~kyle/kyle-pgp.asc; preference=signencrypt
- References: <20090226220720.GA12471@carlos>
- Sender: owner-mutt-users@xxxxxxxx
- User-agent: Mutt/1.5.19 (2009-01-27)
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On Thursday, February 26 at 08:07 PM, quoth Carlos Pita:
> I'm noticing different encoding behavior for headers displayed at
> the index view and those at the pager view for the same email. For
> example, at the index view I can see subjects like: "?Opin? sobre
> Bebidas Alcoh?licas y gan? una TV LCD o una Notebook!", while the
> right chars show instead of ? at the pager view.
Strange!
> First, I want to discard a number of usual causes:
>
> * It's not a cache header issue. I deleted the cache after each related
> configuration change.
Fair enough.
> * I'm accessing Gmail via IMAP
Ahhh, fun. Could be trouble.
> but it's not a case of bad encoding by them.
Are you sure? Compile mutt with debugging (configure with the
--enable-debug flag) and get a debug trace. That will show you EXACTLY
what the Gmail server sent you, so that we can be certain whether this
is a mutt problem or a Gmail problem. It wouldn't be the first time
that Gmail's IMAP service has gotten encoding completely wrong.
Here's something to consider: the index display and the pager display
are the results of different IMAP commands. The index is generated by
asking the server for specific headers (e.g. the subject header). The
pager is generated by asking the server for the full email, which is
then parsed by mutt itself. It's entirely possible for the IMAP server
to give different answers to the two questions.
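
To make the "two questions" concrete, here is a rough Python sketch that
asks an IMAP server for the subject both ways and compares the raw bytes.
The FETCH items are standard IMAP4rev1, but they are only an illustration
and are not copied from mutt's source; the helper names (connect,
fetch_literal, subject_line, compare_subjects) are made up for this
example, and folded (multi-line) subjects are not handled.

```python
import imaplib

# Two different questions about the same message (illustrative FETCH items).
HEADER_ONLY = "(BODY.PEEK[HEADER.FIELDS (SUBJECT)])"
WHOLE_MESSAGE = "(BODY.PEEK[])"


def connect(host, user, password, mailbox="INBOX"):
    """Open a read-only IMAP session (host and credentials are yours to supply)."""
    conn = imaplib.IMAP4_SSL(host)
    conn.login(user, password)
    conn.select(mailbox, readonly=True)
    return conn


def fetch_literal(conn, msg_num, items):
    """Run one FETCH and return the literal bytes the server sent back."""
    status, data = conn.fetch(msg_num, items)
    if status != "OK":
        raise RuntimeError("FETCH failed: %r" % (data,))
    # imaplib hands literals back as (envelope, payload) tuples.
    for part in data:
        if isinstance(part, tuple):
            return part[1]
    raise RuntimeError("no literal in FETCH response")


def subject_line(raw):
    """Return the first Subject: line in a block of raw header bytes."""
    header_block = raw.split(b"\r\n\r\n", 1)[0]
    for line in header_block.split(b"\r\n"):
        if line.lower().startswith(b"subject:"):
            return line
    return None


def compare_subjects(conn, msg_num):
    """Fetch the subject both ways and report whether the raw bytes agree."""
    from_headers = subject_line(fetch_literal(conn, msg_num, HEADER_ONLY))
    from_message = subject_line(fetch_literal(conn, msg_num, WHOLE_MESSAGE))
    return from_headers, from_message, from_headers == from_message
```

If compare_subjects(conn, "1") reports a mismatch for the affected
message, the server is giving two different answers and mutt is just
displaying what it was told.
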
> True, the headers I'm receiving are not 2047 encoded, they are just
> iso-8859-1 or utf-8 encoded, but in theory I'm forcing their charset
> by means of assumed_charset (more on this below).
You're not *forcing* the charset, you're *guessing* the charset.
There's a semantic difference (of course, we computer folk love to
have semantic arguments, so... ;)
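
To show why this is a guess and not a guarantee, here is a minimal
Python sketch of a "try each charset in order" fallback, which is
roughly how a colon-separated candidate list like assumed_charset
behaves. The function name guess_decode and the sample subject are made
up for illustration; this is not mutt's actual code.

```python
def guess_decode(raw, candidates):
    """Decode raw header bytes with the first candidate charset that fits."""
    for charset in candidates:
        try:
            return raw.decode(charset), charset
        except UnicodeDecodeError:
            continue
    # Nothing fit: keep the ASCII parts and mark the rest as unknown.
    return raw.decode("ascii", errors="replace"), None


latin1_subject = "Bebidas Alcohólicas".encode("iso-8859-1")
utf8_subject = "Bebidas Alcohólicas".encode("utf-8")

# UTF-8 first: invalid UTF-8 is rejected, so Latin-1 mail falls through correctly.
print(guess_decode(latin1_subject, ["utf-8", "iso-8859-1"]))
print(guess_decode(utf8_subject, ["utf-8", "iso-8859-1"]))

# Latin-1 first: every byte sequence is "valid" Latin-1, so UTF-8 mail is misread.
print(guess_decode(utf8_subject, ["iso-8859-1", "utf-8"]))
```

The order matters because ISO-8859-1 accepts any byte sequence: put it
first and the guess never fails, it is just sometimes wrong.
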
The thing to keep in mind here is that raw non-ASCII characters in
email headers are *FORBIDDEN* (that's what RFC 2047 encoding exists
for). They're not just a bad idea, they're flat-out illegal. Of course,
idiotic email programs (and spammers) still generate them, but my point
is that you're dealing with some *broken* email here. Every piece of
software that touches this email gets to make its own decisions about
how to handle it, and the answers don't always line up in your favor.
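
For comparison, this is what a legal header looks like. The sketch
below uses Python's standard email.header module to produce and read an
RFC 2047 encoded subject, plus a small check for raw 8-bit bytes. The
helper names and the sample subject are only illustrative.

```python
from email.header import Header, decode_header, make_header


def has_raw_8bit(raw_header):
    """True if the header bytes contain anything outside 7-bit ASCII."""
    return any(byte > 0x7F for byte in raw_header)


def encode_subject(text, charset="utf-8"):
    """Return an ASCII-only, RFC 2047 encoded form of the subject text."""
    return Header(text, charset).encode()


def decode_subject(encoded):
    """Turn an RFC 2047 encoded subject back into a Unicode string."""
    return str(make_header(decode_header(encoded)))


subject = "¡Opiná sobre Bebidas Alcohólicas!"

broken = subject.encode("iso-8859-1")       # raw 8-bit bytes, no charset label
wire = encode_subject(subject, "iso-8859-1")  # ASCII only, charset is stated

print(has_raw_8bit(broken))
print(wire)
print(decode_subject(wire))
```

The encoded form carries its charset with it, so nobody downstream has
to guess. The raw form carries nothing, which is why every program ends
up making its own decision.
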
> * I made some tests with my locale configured to en_US.ISO-8859-1
> and then to en_US.UTF-8.
I presume your terminal is capable of understanding UTF-8 characters?
> I also tested disabling the muttrc charset setting, and forcing it to my
> current locale, whatever it was. It made no difference at all.
Maybe not, but generally speaking, it's a very VERY bad idea to set
$charset manually (unless you really know what you're doing - I know
of only one situation where it's even useful, much less a good idea).
Mutt is very good at figuring out the correct charset from your locale;
if mutt guesses wrong, then your system libraries are pretty much
guaranteed to be giving it incorrect answers.
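
If you want to see what the system libraries report for your current
locale, a small Python sketch like this asks the C library directly.
It assumes a Unix-like system (nl_langinfo isn't available everywhere),
and locale_charset is just a name picked for this example.

```python
import codecs
import locale


def locale_charset():
    """Return the charset the C library reports for the current locale."""
    try:
        # Adopt the locale from LANG / LC_ALL / LC_CTYPE, as a terminal program would.
        locale.setlocale(locale.LC_CTYPE, "")
    except locale.Error:
        # The environment names a locale that is not installed on this system.
        return None
    return locale.nl_langinfo(locale.CODESET)


charset = locale_charset()
print("locale charset:", charset)
if charset:
    try:
        print("Python codec:", codecs.lookup(charset).name)
    except LookupError:
        print("Python has no codec for", charset)
```

If this prints something that doesn't match what your terminal really
displays, fix the locale rather than overriding $charset.
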
> To make things weirder, some non 2047 encoded headers are shown
> correctly at both views. For example, I have a utf-8 email and a latin-1
> email, both with their subject headers encoded in the respective charset
> (I verified this editing the raw emails with e). No matter what my
> locale is the utf-8 email subject is correctly displayed while the
> latin-1 one isn't. Also assumed_charset=iso-8859-1 doesn't fix the
> problem for the latin-1 message.
Hmmm. Interesting. This is from the Gmail server?
Sounds like the problem *may* be with the Gmail server. I think you
need to make *sure* that it's not.
> This begins to feel like random behavior but there is another aspect
> that could be making the difference: the email that is looking bad is a
> multipart one, with no charset specified at the main Content-Type:
> multipart/alternative header; but the well behaved email is single part,
> with Content-Type: text/plain; charset=UTF-8.
You're right, that's probably the difference. That also suggests a
Gmail issue (because mutt doesn't use that header to learn about the
possible header character sets).
Anyway - step 1 is to find out *exactly* what the conversation between
mutt and gmail looks like.
Once you've compiled mutt with the debugging support, run mutt with
the '-d5' flag. That will create the file ~/.muttdebug0, which will be
very verbose, but buried in there is the verbatim IMAP conversation.
If you can't isolate the part relating to these messages, remove your
username and password from the file and post it somewhere we can see
it.
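
If scrubbing the file by hand sounds tedious, something like the
following Python sketch could do a first pass. It makes assumptions:
that credentials show up as arguments to an IMAP LOGIN command, and
that you pass in any other strings (such as your address) you want
masked. The exact layout of the debug file isn't assumed, the sample
lines in the usage note are invented, and you should still read the
result before posting it.

```python
import re

# A tag, the word LOGIN, then whatever arguments follow on the same line.
LOGIN_RE = re.compile(r"^(?P<prefix>.*?\b\w+ LOGIN )(?P<args>.+)$", re.IGNORECASE)


def redact_log(text, secrets=()):
    """Mask LOGIN arguments and any listed secret strings in a debug log."""
    cleaned = []
    for line in text.splitlines():
        line = LOGIN_RE.sub(lambda m: m.group("prefix") + "<redacted>", line)
        for secret in secrets:
            if secret:
                line = line.replace(secret, "<redacted>")
        cleaned.append(line)
    return "\n".join(cleaned)


def redact_file(src, dst, secrets=()):
    """Write a redacted copy of src to dst, leaving the original untouched."""
    # latin-1 maps every byte to one character, so raw 8-bit data survives intact.
    with open(src, encoding="latin-1") as infile:
        text = infile.read()
    with open(dst, "w", encoding="latin-1") as outfile:
        outfile.write(redact_log(text, secrets))
```

For example, redact_log('a0001 LOGIN "me@example.com" "secret"') keeps
the tag and command but replaces the arguments with <redacted>.
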
~Kyle
-- 
We act as though comfort and luxury were the chief requirements of
life, when all that we need to make us really happy is something to be
enthusiastic about.
-- Charles Kingsley
-----BEGIN PGP SIGNATURE-----
Comment: Thank you for using encryption!
iEYEARECAAYFAkmnKqMACgkQBkIOoMqOI14rdwCgnuv7GiX6LMEkhsVAZyYEOFTf
qMgAoOZ8iiorDab/YV7uDy6yc7J27QVy
=bpcx
-----END PGP SIGNATURE-----