
Re: e-mail encoding/formatting (was Re: Split-screen mode in mutt?)



-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Monday, May  8 at 10:53 PM, quoth cga2000:
>> vim, less, od, etc. do not decode quoted-printable encoding. They 
>> edit/view files just as they are.
>> 
> not the quoted-printable ("electronic mail") encoding.. but vim & 
> less at least would necessarily be able to handle UTF-8 encoding. 
> Hence my using "od" to display the hex contents of the file/message. 
> I thought this was the only way I could visualize it without any 
> rendering software tampering with it.
>
> Or am I missing something?

Well, the thing is that what vim sees depends on whether the file 
you give it contains actual UTF-8 bytes or only 7-bit ASCII bytes.

Put another way: the actual email that gets sent, in its raw form, is 
in "quoted-printable" encoding. That means that it actually contains 
the ASCII strings that look like "=20" and "=E2=80=99". That is not 
UTF-8 encoding, that's just a string of ASCII characters. If you edit 
the raw mail file, that's what vim sees.

Mutt, however, knows that these strings are quoted-printable 
encodings of bytes, and so it can transform the "raw" (or 
"undecoded") form into a "decoded" form where things like =E2=80=99 
are replaced with bytes that have the hexadecimal values indicated 
(here E2, 80 and 99). If the "decoded" form is saved to a file, vim 
will see those bytes and will be able to interpret them as a 
UTF-8-encoded character.
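
To make the raw/decoded distinction concrete, here is a minimal sketch 
in Python (not anything mutt itself runs). It uses the standard quopri 
module to do the same kind of quoted-printable decoding; the variable 
names are just for illustration.

```python
import quopri

# What vim/od/less see in the undecoded mail: nine plain ASCII bytes.
raw = b"=E2=80=99"

# Quoted-printable decoding turns each "=XX" triplet into one byte.
decoded = quopri.decodestring(raw)

# Only now do the bytes form a UTF-8 sequence that can be interpreted
# as a character (U+2019, the right single quotation mark).
text = decoded.decode("utf-8")

print(len(raw), len(decoded), decoded.hex(), repr(text))
```

The raw form is nine bytes long, the decoded form is three, and only 
the decoded form is UTF-8 in any meaningful sense.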

In other words, if you run vim/od/less/whatever on the "undecoded" 
("raw") email, you will see the quoted-printable strings rather than 
the UTF-8 characters. If, on the other hand, you run 
vim/od/less/whatever on the "decoded" email, you will see the UTF-8 
bytes. This "decoding" is what is controlled, for example, by mutt's 
$pipe_decode variable. If you set pipe_decode and then pipe a message 
to (for example) less, less should receive the three bytes E2 80 99. 
If you unset pipe_decode and then pipe a message to less, you should 
see the nine-byte string "=E2=80=99" (for example) instead.

Mutt stores all mail in "raw" or "undecoded" format, and decodes it 
every time you view it. So, saving mail to a file and then editing 
that file will show you the long =E2=80=99 form.
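
The same thing can be seen with a saved message and Python's standard 
email package. This is only a sketch: the message below is a made-up 
example (hypothetical address and body), and it illustrates the 
raw-versus-decoded difference rather than how mutt does it internally.

```python
from email import message_from_bytes

# A hypothetical raw message, as it would sit in a mailbox file.
RAW_MAIL = (
    b"From: sender@example.org\n"
    b"Subject: test\n"
    b"MIME-Version: 1.0\n"
    b"Content-Type: text/plain; charset=utf-8\n"
    b"Content-Transfer-Encoding: quoted-printable\n"
    b"\n"
    b"It=E2=80=99s\n"
)

msg = message_from_bytes(RAW_MAIL)

# Undecoded body: still the ASCII string with "=E2=80=99" in it,
# which is what an editor shows when you open the saved file.
raw_body = msg.get_payload()

# Decoded body: the Content-Transfer-Encoding has been undone, so the
# three bytes E2 80 99 are present and can be read as UTF-8.
decoded_body = msg.get_payload(decode=True)

print(repr(raw_body))
print(repr(decoded_body))
print(decoded_body.decode("utf-8"))
```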

Does that help?

~Kyle
- -- 
A great many people think they are thinking when they are actually 
rearranging their prejudices.
                                                      -- William James
-----BEGIN PGP SIGNATURE-----
Comment: Thank you for using encryption!

iD8DBQFEYRIxBkIOoMqOI14RAi59AKDv4mt+5wG/0q9n5aI5Dq38pLWYJACfbO/a
iUote+sWckgYKfPkKi77Jq4=
=Cwjf
-----END PGP SIGNATURE-----