
Re: e-mail encoding/formatting (was Re: Split-screen mode in mutt?)



Thus spake Kyle Wheeler on Fri, May 05, 2006 at 03:08:04PM -0400 or 
thereabouts: <kyle-mutt@xxxxxxxxxxxxxx> [2006-05-05 17:21]:
> On Thursday, May  4 at 11:40 PM, quoth cga2000:
> > Well, things may not be that trivial after all. With my new setup 
> > Kyle's curly quotes are displayed correctly. The message above does 
> > not.
> 
> Now that's just weird.
> 

could be a fluke of sorts.. keep in mind that I am experimenting and
may be doing some very creative stuff that seasoned mutt users would
never even imagine..! :-)

> > 1. My attempt at curly quotes is rendered by displaying what I 
> > assume are the three-byte values of the characters in the UTF-8 
> > encoding:
> > x'e2809c' and x'e2809d'. Mutt's - or whatever's - rendering is actually 
> > an equal sign followed by the hex value of the first byte, followed by 
> > another '=' followed by the hex value of the second.. etc.

double-checked the raw contents using the od command and what I see in
mutt (both the internal pager and vim) is definitely a transcription of
the three-byte encoding of each character.

now a corollary would be that for some reason mutt/vim are not
recognizing the UTF-8 encoding for this particular message, or am I
mistaken?
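
For reference, a minimal Python sketch of how the two curly quotes, their
three UTF-8 bytes, and the '=XX' form relate to each other. It uses only
the standard-library quopri module and says nothing about how mutt or vim
handle this internally; the variable names are purely illustrative.

```python
import quopri

left, right = "\u201c", "\u201d"   # left and right curly double quotes

# UTF-8 encodes each of these characters as three bytes.
left_bytes = left.encode("utf-8")
right_bytes = right.encode("utf-8")
print(left_bytes.hex(), right_bytes.hex())

# Quoted-printable writes each non-ASCII byte as '=' plus two hex digits.
encoded = quopri.encodestring(left_bytes + right_bytes)
print(encoded)

# Decoding the '=XX' form and reading the bytes as UTF-8 restores the quotes.
decoded = quopri.decodestring(encoded).decode("utf-8")
```

So the '=E2=80=9C' seen in the pager is the quoted-printable transport form
of the bytes, not a UTF-8 display failure as such.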

> 
> That's what's known as "quoted-printable" encoding 
> (http://en.wikipedia.org/wiki/Quoted-printable). Normally, mutt should 
> transform that into the encoded bytes for you. I have no idea why it 
> wouldn't... And I have no idea why it would work for my email but not 
> yours, since our emails are encoded in exactly the same way. You may 

meaning that this same message we are talking about displays correctly
at your end?

> have found a mutt bug of some kind...

but then it only seems to affect this particular message on my hard
drive. Could it be that this message was somehow "damaged" on the way
back to me (from the list, I mean) .. possibly due to something in my
headers relative to content-type.. encoding.. etc..
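
One way to test the header hypothesis, sketched with Python's standard
email package. The message below is made up for illustration (it is not
the message from this thread), and the helper names build() and show() are
hypothetical. The sketch only shows that a body stays in its '=XX' form
when the Content-Transfer-Encoding header is absent; whether that is what
happened here is an assumption.

```python
from email import message_from_bytes, policy

def build(with_cte):
    """Build a made-up test message, with or without the encoding header."""
    lines = [
        b"From: someone@example.invalid",
        b"Subject: curly quote test",
        b"MIME-Version: 1.0",
        b"Content-Type: text/plain; charset=utf-8",
    ]
    if with_cte:
        lines.append(b"Content-Transfer-Encoding: quoted-printable")
    lines += [b"", b"=E2=80=9Cquoted=E2=80=9D text..=20", b""]
    return b"\r\n".join(lines)

def show(raw):
    """Return (transfer encoding, charset, decoded body) for a raw message."""
    msg = message_from_bytes(raw, policy=policy.default)
    cte = msg.get("Content-Transfer-Encoding")   # None when the header is absent
    charset = msg.get_content_charset()
    return cte, charset, msg.get_content()

good = show(build(with_cte=True))
bad = show(build(with_cte=False))

# ascii() keeps the output printable on terminals that are not UTF-8.
print(good[0], good[1], ascii(good[2]))
print(bad[0], bad[1], ascii(bad[2]))
```

Comparing the two header lines in the raw copy of the affected message
against a message that displays correctly would be the equivalent manual
check.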

> 
> Try this experiment: send yourself a message with curly-quotes in it, 

already did that - naturally.. and sure enough, the version that came
back displays ok.

> then send yourself another one with an attachment (doesn't matter 
> what, just so long as it's an attachment). 

.. you mean the same message with just the curly quotes and an
additional attachment.. sorry I'm slow but I'm just too new to mutt to
guess what you are driving at.

and also <blush> how do I do that - send an attachment? </blush>

> Is there a difference? 
> (There *shouldn't* be, but the attachment is the only thing I can 
> think of that's different between our two emails.)
> 
> > 2. There are a bunch of =20 artefacts in my email, some 
> > corresponding to the first '.' of my personal rendering of the 
> > ellipsis using what en.US has to offer - I just type two dots like 
> > so: '..' uses less space than three dots..
> 
> That goes back to the quoted-printable thing; that's just the way it does 
> it. What matters here is that for whatever reason, mutt isn't decoding 
> it for you.

I think I still have the message on my hard drive. Maybe there's
something in the raw message that's different from your version and
causes mutt (and vim) to switch to that '= plus byte content'
transliteration. 

If I view the message in mutt, type 'L' to edit it, write to a file in
/tmp, and then start vim on this file I see the same thing.. the above
in lieu of the curly double quotes as well as a few =20 and =2D. So both
vim and mutt are no longer quite capable of figuring out the space and
the dot - both 1-byte characters - in a UTF-8 context, at least where
this message is concerned (*and* on my system as it is currently set
up). 

So it sounds like some common component between the two is involved,
likely a particular function in whatever lib takes care of this on
their behalf?

> 
> > Same behavior in linux console (after unicode-start) and xterm -e8.
> 
> ...you mean xterm -u8, right?

indeed.. :-/ 

> 
> You may also want to try:
> 
>     env LC_CTYPE=en_US.UTF-8 xterm -u8
> 
> > Interestingly, the above message displays correctly in mozilla-mail.
> 
> Because it should! :)
> 

hehe.. 

sounds to me as if there is something in this message that confuses the
routine that does the transliteration on behalf of mutt, vim, less, od,
etc.. and that mozilla either does not use the same lib or uses a
different version. 

> > I don't know if anyone is seeing what I am seeing but does anyone 
> > guess at what the problem could be? 
> 
> Wish I could be of more help - your messages all look fine to me.
> 

no big deal. I'll keep an eye out for possible recurrences.. see if I
can detect a pattern. Pretty sure I could crack this one easily if I was
somewhat familiar with UTF-8 encoding and its linux implementation. 

I could be wrong but I have a feeling my mutt mbox contains the 8-bit
representation of the '=' '80' '9C' '9D' etc. while the mozilla mailbox
does not. Therefore the message was corrupted from here to there and
back again. The only difference between the mutt -> list -> mozilla and
the mutt -> list -> mutt paths would have to be something on my
system.. fetchmail or procmail, maybe.
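
To look for a pattern, a rough sketch that scans an mbox for messages
whose body contains '=XX' sequences although no quoted-printable transfer
encoding is declared. It uses Python's standard mailbox module on a
throwaway mbox built on the spot; the names looks_undeclared() and make()
and the sample messages are hypothetical, and the test is only a
heuristic, since a plain-text body may legitimately contain '=20'.

```python
import mailbox
import os
import re
import tempfile
from email.message import Message

QP_ESCAPE = re.compile(r"=[0-9A-F]{2}")

def looks_undeclared(msg):
    """True if the body has =XX escapes but no quoted-printable header."""
    cte = str(msg.get("Content-Transfer-Encoding", "")).strip().lower()
    payload = msg.get_payload()
    if not isinstance(payload, str):   # multipart messages are skipped here
        return False
    return cte != "quoted-printable" and QP_ESCAPE.search(payload) is not None

def make(subject, declare):
    """Build a made-up message, optionally declaring its transfer encoding."""
    m = Message()
    m["From"] = "someone@example.invalid"
    m["Subject"] = subject
    m["Content-Type"] = "text/plain; charset=utf-8"
    if declare:
        m["Content-Transfer-Encoding"] = "quoted-printable"
    m.set_payload("=E2=80=9Chello=E2=80=9D..=20\n")
    return m

with tempfile.TemporaryDirectory() as tmp:
    box = mailbox.mbox(os.path.join(tmp, "sample.mbox"))
    box.add(make("declared", True))
    box.add(make("undeclared", False))
    box.flush()
    flagged = [m["Subject"] for m in box if looks_undeclared(m)]
    box.close()

print(flagged)
```

Pointing mailbox.mbox() at a copy of the real mailbox instead of the
temporary file would list the subjects of any messages that arrived with
the encoded body but without the matching header.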

Thanks much for the help.

cga