Re: mutt - slow mbox'es
* David Yitzchak Cohen <lists+mutt_users@xxxxxxxxxxxxxx> [2004-07-22 22:49 -0400]:
> On Thu, Jul 22, 2004 at 02:27:49PM EDT, Nicolas Rachinsky wrote:
> > After removing cache the same with another Mailbox containing now
> > 68244 emails with a total size of 300MB.
>
> Okay, now you've got me beat for mail compactness. Let's see what happens:
>
> > /usr/bin/time -h -l mutt -F /dev/null -f small.mbox -e 'push x'
> > 25.82s real 14.87s user 1.81s sys
> > 4859 block input operations
> > 0 block output operations
>
> Okay, my previous assertion that mbox load time is mostly proportional to
> mailbox size is clearly totally bogus. I'm guessing the Content-Length
> header is probably to blame. If you still have the small.mbox lying
> around, I'd be awfully interested in the results of running the tests
> on the results after a "egrep -v '^Content-[Ll]ength:'" :-)
>
> > /usr/bin/time -h -l mutt -F /dev/null -f small.mbox -e 'push x'
> > 16.91s real 14.63s user 1.16s sys
> > 0 block input operations
> > 0 block output operations
>
> Well, in case you forgot, you have 1GB of RAM. You can guess where
> every last bit of that mbox went after the first time you fetched it ;-)
With Content-Length removed:
/usr/bin/time -h -l mutt -F /dev/null -f small2 -e 'push x'
24.91s real 15.52s user 1.78s sys
4828 block input operations
0 block output operations
/usr/bin/time -h -l mutt -F /dev/null -f small2 -e 'push x'
17.71s real 15.39s user 1.11s sys
0 block input operations
0 block output operations
Not such a big difference. Not what I expected.
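For anyone who wants to repeat this without egrep, here is a minimal Python sketch of the same filter. The function name and the sample message are made up for illustration. Like the egrep one-liner, it drops every line starting with Content-Length:, including any such line that happens to sit in a message body.

```python
import re

# Same pattern as the egrep -v expression quoted above.
CONTENT_LENGTH = re.compile(r"^Content-[Ll]ength:")

def strip_content_length(lines):
    """Return the lines that do not start with Content-Length: (like egrep -v)."""
    return [line for line in lines if not CONTENT_LENGTH.match(line)]

# Tiny made-up message, just to show the effect.
sample = [
    "From nobody Thu Jul 22 14:27:49 2004\n",
    "Subject: test\n",
    "Content-Length: 6\n",
    "\n",
    "hello\n",
]
filtered = strip_content_length(sample)
print("".join(filtered), end="")
```

For a real mbox one would read the file line by line and write the surviving lines to a new file instead of holding everything in a list.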
> > /usr/bin/time -h -l mutt -F /dev/null -f small.maildir -e 'push x'
> > 39.63s real 8.73s user 7.21s sys
> > 68963 block input operations
> > 0 block output operations
>
> It's impressive that maildir is able to keep from falling below 50% of
> the mbox speed even with >64K mails in a directory (You're on reiserfs,
> I assume?).
No. There is no reiserfs for FreeBSD. It's just UFS with linear lists
as directories.
> Note, though, the syscall time there. We're talking about
> massive amounts of work being done by your kernel, servicing about
> 17 times as many read(2)s as the mbox. Also worth noting is the time
> spent idling while waiting on the disk. The mbox waits for only about
> 9 seconds, while maildir winds up waiting for almost 24! Clearly, you
> wouldn't want several users banging away on maildirs at the same time
> on your system. . .
Don't mistake me for one of the 'maildir is the best' proponents. I'm using
maildirs for my incoming folders (*) and mbox for most archive
boxes (**).
(*) Here maildirs are IMHO more reliable, and they are -- even without
the header cache -- clearly faster, not necessarily while opening
but for marking/deleting/... mails.
(**) I didn't use them for archive boxes with relatively big mails, but
for the other boxes mboxes are faster -- without cache -- and they need less
disk space -- even without a header cache. And very old boxes can be
easily compressed.
> > /usr/bin/time -h -l mutt -F /dev/null -f small.maildir -e 'push x'
> > 36.66s real 8.61s user 6.52s sys
> > 68253 block input operations
> > 0 block output operations
>
> Now, isn't that interesting? 1GB of RAM was barely able to cache anything
> between your opens. (Maybe a cron job or something happened between the
> two opens? I wasn't expecting the second open to be _that_ bad. . .)
> Assuming it's not a fluke, somebody on your FS team needs to be blamed:
It's no fluke.
> it looks to me like the only thing that was still cached was the directory
> index itself (and maybe the permission structs in the inode tables).
> With a whole gigabyte of RAM, I think you have the right to expect better.
> The system was waiting for a whopping 21 seconds of disk time, to access
> data that could've easily been cached if Linux only cared enough to :-(
It's FreeBSD, not Linux, but you're right, it could have been better.
Are there any FreeBSD FS gurus reading this? :)
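The waiting times mentioned above (about 9, almost 24 and 21 seconds) are simply real minus user minus sys. A small sketch of that arithmetic, using only the numbers from the runs quoted above; the function name is made up, and treating all of the remainder as disk wait is an assumption:

```python
def wait_seconds(real, user, sys_):
    """Wall-clock time not accounted for by CPU time (user + sys)."""
    return round(real - user - sys_, 2)

# (real, user, sys) as reported by /usr/bin/time in the runs above.
runs = {
    "mbox, first open": (25.82, 14.87, 1.81),
    "maildir, first open": (39.63, 8.73, 7.21),
    "maildir, second open": (36.66, 8.61, 6.52),
}

for name, (real, user, sys_) in runs.items():
    print(f"{name}: {wait_seconds(real, user, sys_):.2f}s not spent on the CPU")
```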
> > /usr/bin/time -h -l mutt -F /dev/null -f small.maildir -e 'set
> > maildir_cache=cache' -e 'unset maildir_cache_verify' -e 'push x'
> > 3.87s real 2.35s user 0.84s sys
> > 1 block input operations
> > 12 block output operations
>
> I'm not sure what's happening there, exactly.
'unset maildir_cache_verify' should leave out one stat per message,
so it's a bit faster -- as expected.
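To illustrate what one stat per message means, here is a rough sketch. It is not mutt's code; the function and the temporary test directory are hypothetical and only count the stat calls a verifying scan would make compared to a non-verifying one.

```python
import os
import tempfile

def scan_maildir(path, verify):
    """Return (number of entries, number of stat calls) for one directory scan."""
    names = sorted(os.listdir(path))
    stat_calls = 0
    for name in names:
        if verify:
            # With verification: one stat(2) per message file.
            os.stat(os.path.join(path, name))
            stat_calls += 1
    return len(names), stat_calls

# Build a throwaway directory with five empty "messages".
with tempfile.TemporaryDirectory() as d:
    for i in range(5):
        open(os.path.join(d, f"msg{i}"), "w").close()
    with_verify = scan_maildir(d, verify=True)
    without_verify = scan_maildir(d, verify=False)

print("verify:", with_verify, "no verify:", without_verify)
```

With 68244 messages the verifying scan would make 68244 extra stat calls, which is where the small saving comes from.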
> > BTW:
> > 136M cache
>
> When was that number obtained? I'd expect buffers+cache to be about
> 300MB after the first mbox read.
Ah, this was misleading. The cache file after reading small.maildir
with the header cache was 136MB. With a page size of 16k, instead
of the default 2k, mutt is a bit faster and the file is only 52MB.
> > And now a third folder, containing 18 emails with a total size of 310MB.
>
> Our prediction here is obviously that maildir will trounce mbox, even
> without the headercache.
>
> > /usr/bin/time -h -l mutt -F /dev/null -f monster.mbox -e 'push x'
> > 0.40s real 0.01s user 0.01s sys
> > 47 block input operations
> > 0 block output operations
>
> There's no way on Earth that your system can read 310MB off the disk in
> less than 0.4 seconds!
Correct.
Nicolas