
utf8 file corruption after transmission over email



Hello,
I have a mystery that I have been unable to solve. mutt-1.5.19 is
running on OpenBSD 4.5, built "--with-idn". I have a small sample XML
file, encoded in UTF-8, that I'm trying to send as an attachment. When I
attach it, mutt correctly identifies it as [text/plain, 8bit, utf-8, 0.3K],
since it contains non-ASCII characters (in this case only one such
character). After I send it, the attached file arrives corrupted.

Before sending: MD5 (hw2.jff) = 9a14b9ac1a12deb07d262b6658d7b9b2
After sending:  MD5 (hw2.jff-corrupted) = 718597b09e7544f89cd255a5d4c8e301

The file contains a 'WHITE SQUARE' character (U+25A1), see
http://www.fileformat.info/info/unicode/char/25a1/index.htm

After examining both files, this is the only thing that changes:
"0xE2 0x96 0xA1" becomes "0xD0 0x91 0xE2 0x88 0x9A 0xE2 0x95 0x91"
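
One way to narrow this down is to test whether the corrupted bytes are
what you get when the original three bytes are read as some 8-bit
charset and then converted to UTF-8. The Python sketch below does that
for a few candidates. The helper name misread_and_reencode and the list
of candidate charsets are my own choices for illustration. For the
bytes quoted above, the KOI8-R candidate reproduces the corrupted
sequence exactly, which suggests (but does not prove) that somewhere
along the way the attachment was treated as KOI8-R, or a close KOI8
variant, and recoded to UTF-8. Which setting or program did that is not
something the bytes alone can show.

```python
# Byte sequences quoted in the message above.
ORIGINAL = bytes([0xE2, 0x96, 0xA1])  # U+25A1 WHITE SQUARE in UTF-8
CORRUPTED = bytes([0xD0, 0x91, 0xE2, 0x88, 0x9A, 0xE2, 0x95, 0x91])


def misread_and_reencode(data: bytes, assumed_charset: str) -> bytes:
    """Decode data as assumed_charset, then encode the result as UTF-8.

    Hypothetical helper: it simulates a program that wrongly believes
    the input is in assumed_charset and "converts" it to UTF-8.
    """
    return data.decode(assumed_charset, errors="replace").encode("utf-8")


# Show which characters the corrupted bytes spell out when read as UTF-8.
print([f"U+{ord(ch):04X}" for ch in CORRUPTED.decode("utf-8")])

# Candidate charsets are an assumption; extend the tuple as needed.
for charset in ("latin-1", "cp1251", "koi8_r"):
    matches = misread_and_reencode(ORIGINAL, charset) == CORRUPTED
    print(f"{charset}: {'match' if matches else 'no match'}")
```

If a candidate matches, the next place to look would be wherever that
charset name appears in the locale or mail client configuration on the
sending side.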

Both files are available for your inspection at:
http://www.x96.org/files/hw2.jff
http://www.x96.org/files/hw2.jff-corrupted

I appreciate your help.