Re: inaccurate estimates of message size
- To: mutt-users@xxxxxxxx
- Subject: Re: inaccurate estimates of message size
- From: Kyle Wheeler <kyle-mutt@xxxxxxxxxxxxxx>
- Date: Fri, 23 May 2008 17:11:06 -0500
- In-reply-to: <20080523200743.GB17412@xxxxxxxxxxxxxxxxxxxxxxxx>
- List-post: <mailto:mutt-users@mutt.org>
- List-unsubscribe: send mail to majordomo@mutt.org, body only "unsubscribe mutt-users"
- Mail-followup-to: mutt-users@xxxxxxxx
- Openpgp: id=CA8E235E; url=http://www.memoryhole.net/~kyle/kyle-pgp.asc; preference=signencrypt
- References: <20080523200743.GB17412@xxxxxxxxxxxxxxxxxxxxxxxx>
- Sender: owner-mutt-users@xxxxxxxx
- User-agent: Mutt/1.5.18 (2008-05-19)
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On Friday, May 23 at 04:07 PM, quoth dv1445@xxxxxxxxx:
> Mutt makes (and always has, for me) extremely inaccurate estimates of
> how big my email messages are. Often off by a factor of 5. Why is
> this? Is it maybe not mutt's fault, but the IMAP servers I interact
> with? On the one hand, I hope it's the servers' fault, because mutt
> is great. On the other hand, I hope it's mutt's fault, because then
> maybe something can eventually be done about it.
This one I don't know for sure about... but I *suspect* it might be
neither. I actually have the opposite problem; emails get bigger when
I download them. My guess is that it's a difference between logical
size and actual size. The IMAP server might be reporting how big the
message is in its own storage (which will vary depending on the block
size of the backend---you know, the standard problem of a 1-byte file
being 4k big on disk) while mutt is reporting how many bytes of
message it actually read. There's another issue there which is the
CRLF problem. Mutt asks the IMAP server for a message's "RFC822.SIZE",
which means that all line endings are *supposed* to be CRLFs (i.e. 2
bytes), while when it actually downloads, mutt is reporting the size
of what it actually received, which may not have all the right line
endings. And it's possible that the server is under-reporting things
(e.g. it's reporting a compressed size, or a size with wrong line
endings) or over-reporting things (e.g. it's reporting a file size).
But that's just a guess. The point is, because of line-ending issues
and storage issues, getting the size "right" can be surprisingly
annoying and slow, and I would imagine that many IMAP servers take
shortcuts that result in loss of accuracy (and that the inaccuracy
depends on the brand of server as well as the backend storage
mechanism used).
Chances are, mutt's doing its best to give you the most accurate
information it has available... it just doesn't always have very
accurate information.
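To make the guess above concrete, here is a rough Python sketch of
three different "sizes" one could report for the same message: the
byte count with bare LF line endings, the byte count with CRLF line
endings (what RFC822.SIZE is supposed to reflect), and the space used
on a filesystem that allocates whole blocks. The function name
size_views and the 4096-byte block size are assumptions made up for
illustration; this is not what mutt or any particular IMAP server
actually does.

```python
import math


def size_views(message: bytes, block_size: int = 4096) -> dict:
    """Return three plausible 'sizes' for one message (hypothetical helper).

    block_size is an assumed filesystem block size, not a measured one.
    """
    # Normalize to bare LF first so the input's own line endings don't matter.
    lf_bytes = message.replace(b"\r\n", b"\n")
    # Every line ending becomes two bytes (CR LF), as on the wire.
    crlf_bytes = lf_bytes.replace(b"\n", b"\r\n")
    # A file occupies whole blocks on disk, so round up to a block multiple.
    on_disk = math.ceil(len(lf_bytes) / block_size) * block_size
    return {
        "lf": len(lf_bytes),
        "crlf": len(crlf_bytes),
        "on_disk": on_disk,
    }


# A small made-up message: one header, a blank line, 100 body lines.
sample = b"Subject: hi\n\n" + b"line of text\n" * 100
views = size_views(sample)
print(views)
```

The CRLF figure exceeds the LF figure by exactly one byte per line,
and the on-disk figure jumps to a full block even for a tiny message,
so three honest programs can report three different numbers.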
> Mutt also does this with attachments. Say I want to attach a PDF
> that I know for a fact is of size 125K. I attach it in the compose
> window, and immediately mutt says that it's of size, say, 193K.
> (Sometimes mutt adds only several K, usually lots). What's going
> on? Is mutt doing something to the PDF? -gmn
THIS is actually very easy to explain. When you are browsing for
files, mutt is telling you how big the file is on disk. When you've
attached a file, mutt has to encode that file in a way that is safe to
transmit in email (which has a lot of caveats). If the attachment is
text, aside from converting line-endings, that's not going to result
in much of a change in size. For binary files like PDFs, that
generally means Base64-encoding the file, a process that increases the
size of a file by approximately one-third (a little more once the
encoded text is wrapped into lines with CRLF line-endings).
Ballooning from 125K to 193K is more than Base64 alone would predict
(roughly 170K), but it's in the right ballpark.
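For a feel of the Base64 overhead, here is a small Python sketch that
encodes 125K of dummy bytes the way a MIME attachment is typically
encoded (76-character lines, each ending in CRLF) and compares sizes.
The helper name mime_base64_size is made up for this example, and the
zero-filled payload merely stands in for a real PDF; mutt's own
encoder may differ in details.

```python
import base64


def mime_base64_size(raw: bytes) -> int:
    """Size of raw after Base64 encoding with CRLF-terminated lines.

    Hypothetical helper for illustration only.
    """
    # encodebytes() emits lines of at most 76 characters, each ending in "\n".
    encoded = base64.encodebytes(raw)
    # Mail on the wire uses CRLF, so each line ending costs two bytes.
    return len(encoded.replace(b"\n", b"\r\n"))


raw = bytes(125 * 1024)  # 125K of zero bytes standing in for a PDF
encoded_size = mime_base64_size(raw)
ratio = encoded_size / len(raw)
print(f"raw: {len(raw)} bytes, encoded: {encoded_size} bytes, "
      f"ratio: {ratio:.3f}")
```

Base64 turns every 3 input bytes into 4 output characters, and the
line endings add a couple of percent on top, so the encoded size ends
up roughly 37% larger than the original regardless of content.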
In many ways, this reinforces the claim that many make that email is a
*lousy* file-transfer mechanism. They're absolutely right, because to
do it safely, you have to encode your file in such a way that it can
be embedded in a format originally designed for short plain-text
messages... and that means your file is going to get noticeably
bigger, and it takes time to encode and decode.
~Kyle
- --
Coffee is the common man's gold, and like gold, it brings to every
person the feeling of luxury and nobility.
-- Sheik Abd-al-Kadir
-----BEGIN PGP SIGNATURE-----
Comment: Thank you for using encryption!
iEYEARECAAYFAkg3QPoACgkQBkIOoMqOI15NAACglL6lH+hTpx4Vts2umyvDYdJ4
P8kAoL4LuDiszjXc5EtH8ycdIFAipyCp
=4ZVd
-----END PGP SIGNATURE-----