
Re: inaccurate estimates of message size



-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Friday, May 23 at 04:07 PM, quoth dv1445@xxxxxxxxx:
> Mutt (always has for me) made extremely inaccurate estimates of how 
> big my email messages are.  Often off by a factor of 5.  Why is 
> this?  Is it maybe not mutt's fault, but the IMAP servers I interact 
> with?  On the one hand, I hope it's the servers' fault, because mutt 
> is great.  On the other hand, I hope it's mutt's fault, because then 
> maybe something can eventually be done about it.

This one I don't know for sure about... but I *suspect* it might be 
neither. I actually have the opposite problem; emails get bigger when 
I download them. My guess is that it's a difference between logical 
size and actual size. The IMAP server might be reporting how big the 
message is in its own storage (which will vary depending on the block 
size of the backend---you know, the standard problem of a 1-byte file 
being 4k big on disk) while mutt is reporting how many bytes of 
message it actually read. There's another issue there which is the 
CRLF problem. Mutt asks the IMAP server for a message's "RFC822.SIZE", 
which means that all line endings are *supposed* to be CRLFs (i.e. 2 
bytes), while when it actually downloads, mutt is reporting the size 
of what it actually received, which may not have all the right line 
endings. And it's possible that the server is under-reporting things 
(e.g. it's reporting a compressed size, or a size with wrong line 
endings) or over-reporting things (e.g. it's reporting a file size).
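
To make those two effects concrete, here is a minimal sketch. The 
function names and the 4096-byte block size are assumptions picked 
for illustration; this is not what mutt or any particular IMAP server 
actually does.

```python
import math

def crlf_size(message: bytes) -> int:
    """Size of the message once every line ending is normalised to CRLF."""
    # Collapse existing CRLFs first so they are not counted twice.
    normalised = message.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")
    return len(normalised)

def on_disk_size(logical_size: int, block_size: int = 4096) -> int:
    """Space used when storage is allocated in whole blocks (assumed 4k)."""
    return math.ceil(logical_size / block_size) * block_size

# A message stored with bare LF line endings, as many Unix backends do.
stored = b"Subject: hi\n\nline one\nline two\n"

print("bytes as stored:      ", len(stored))
print("bytes with CRLF:      ", crlf_size(stored))
print("bytes occupied on disk:", on_disk_size(len(stored)))
```

Three different numbers for the same message, and any of them could 
plausibly be what a server hands back as the "size".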

But that's just a guess. The point is, because of line-ending issues 
and storage issues, getting the size "right" can be surprisingly 
annoying and slow, and I would imagine that many IMAP servers take 
shortcuts that result in loss of accuracy (and that the inaccuracy 
depends on the brand of server as well as the backend storage 
mechanism used).

Chances are, mutt's doing its best to give you the most accurate 
information it has available... it just doesn't always have very 
accurate information.

> Mutt also does this with attachments.  Say I want to attach a PDF 
> that I know for a fact is of size 125K.  I attach it in the compose 
> window, and immediately mutt says that it's of size, say, 193K. 
> (Sometimes mutt adds only several K, usually lots).  What's going 
> on?  Is mutt doing something to the PDF? -gmn

THIS is actually very easy to explain. When you are browsing for 
files, mutt is telling you how big the file is on disk. When you've 
attached a file, mutt has to encode that file in a way that is safe to 
transmit in email (which has a lot of caveats). If the attachment is 
text, aside from converting line-endings, that's not going to result 
in much of a change in size. For binary files like PDFs, that 
generally means Base64-encoding the file, a process that turns every 
3 bytes into 4 characters and so increases the size of a file by 
one-third (closer to 37% after adding all the right CRLF line-endings 
and formatting it appropriately). Ballooning from 125K to 193K is a 
bit more than that arithmetic predicts, but it's the right kind of 
growth.
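
As a rough sketch of the arithmetic, assuming the common MIME layout 
of 76-character Base64 lines that each end in CRLF (the function name 
is made up for illustration, and this is not mutt's own code):

```python
import base64

def base64_mime_size(raw_size: int, line_length: int = 76) -> int:
    """Estimate the size of raw_size bytes as a Base64 MIME body."""
    # Base64 emits 4 characters for every 3 input bytes, padded up.
    encoded_chars = 4 * ((raw_size + 2) // 3)
    # The encoded text is wrapped; each line ends in CRLF (2 bytes).
    lines = (encoded_chars + line_length - 1) // line_length
    return encoded_chars + 2 * lines

# Cross-check the estimate against the standard library's encoder,
# which wraps at 76 characters but uses bare LF line endings.
data = bytes(1000)
actual = len(base64.encodebytes(data).replace(b"\n", b"\r\n"))
print(base64_mime_size(len(data)), actual)

raw = 125 * 1024
print("overhead ratio:", base64_mime_size(raw) / raw)
```

Under those assumptions the overhead comes out at a little over a 
third of the raw size, before any MIME headers are counted.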

In many ways, this reinforces the claim many people make that email 
is a *lousy* file-transfer mechanism. They're absolutely right, 
because to do it safely, you have to encode your file in such a way 
that it can be embedded in a format originally designed for short 
plain-text messages... and that means your file is going to get 
*much* bigger, and it takes time to encode and decode.

~Kyle
- -- 
Coffee is the common man's gold, and like gold, it brings to every 
person the feeling of luxury and nobility.
                                                  -- Sheik Abd-al-Kadir
-----BEGIN PGP SIGNATURE-----
Comment: Thank you for using encryption!

iEYEARECAAYFAkg3QPoACgkQBkIOoMqOI15NAACglL6lH+hTpx4Vts2umyvDYdJ4
P8kAoL4LuDiszjXc5EtH8ycdIFAipyCp
=4ZVd
-----END PGP SIGNATURE-----