
Re: mutt - slow mbox'es



* David Yitzchak Cohen <lists+mutt_users@xxxxxxxxxxxxxx> [2004-07-22 22:49 -0400]:
> On Thu, Jul 22, 2004 at 02:27:49PM EDT, Nicolas Rachinsky wrote:
> > After removing cache the same with another Mailbox containing now
> > 68244 emails with a total size of 300MB.
> 
> Okay, now you've got me beat for mail compactness.  Let's see what happens:
> 
> > /usr/bin/time -h -l mutt -F /dev/null -f small.mbox -e 'push x'
> >         25.82s real             14.87s user             1.81s sys
> >       4859  block input operations
> >          0  block output operations
> 
> Okay, my previous assertion that mbox load time is mostly proportional to
> mailbox size is clearly totally bogus.  I'm guessing the Content-Length
> header is probably to blame.  If you still have the small.mbox lying
> around, I'd be awfully interested in the results of running the tests
> on the results after a "egrep -v '^Content-[Ll]ength:'" :-)
> 
> > /usr/bin/time -h -l mutt -F /dev/null -f small.mbox -e 'push x'
> >         16.91s real             14.63s user             1.16s sys
> >          0  block input operations
> >          0  block output operations
> 
> Well, in case you forgot, you have 1GB of RAM.  You can guess where
> every last bit of that mbox went after the first time you fetched it ;-)


With Content-Length removed:


/usr/bin/time -h -l mutt -F /dev/null -f small2 -e 'push x'
        24.91s real             15.52s user             1.78s sys
      4828  block input operations
         0  block output operations


/usr/bin/time -h -l mutt -F /dev/null -f small2 -e 'push x'
        17.71s real             15.39s user             1.11s sys
         0  block input operations
         0  block output operations


Not such a big difference. Not what I expected.
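
One caveat about the egrep approach quoted above: it drops every matching line, including body lines that happen to start with "Content-Length:". A minimal sketch of a header-only variant follows. It assumes a simplified mbox layout in which every line starting with "From " begins a new message and the first blank line ends its headers; strip_content_length and the sample message are made up for illustration.

```python
import re

# Matches the same header names as the egrep pattern quoted above.
CONTENT_LENGTH = re.compile(r"^Content-[Ll]ength:")

def strip_content_length(lines):
    """Yield mbox lines, dropping Content-Length lines found in message headers.

    Simplification (assumption): every line starting with "From " begins a
    new message, and the first blank line after it ends that message's headers.
    """
    in_headers = False
    for line in lines:
        if line.startswith("From "):
            in_headers = True
        elif in_headers and line.strip() == "":
            in_headers = False
        elif in_headers and CONTENT_LENGTH.match(line):
            continue  # drop the header line, keep everything else
        yield line

# Hypothetical one-message mbox, used only to exercise the function.
sample = [
    "From alice@example.invalid Thu Jul 22 14:27:49 2004\n",
    "Subject: test\n",
    "Content-Length: 40\n",
    "\n",
    "Content-Length: this body line is kept\n",
]
cleaned = list(strip_content_length(sample))
print("".join(cleaned), end="")
```

Since a mailbox that relies on Content-Length may contain unescaped "From " lines in message bodies, the "From " assumption may not hold for such a file, so this is only a sketch.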


> > /usr/bin/time -h -l mutt -F /dev/null -f small.maildir -e 'push x'
> >         39.63s real             8.73s user              7.21s sys
> >      68963  block input operations
> >          0  block output operations
> 
> It's impressive that maildir is able to keep from falling below 50% of
> the mbox speed even with >64K mails in a directory (You're on reiserfs,
> I assume?).

No. There is no reiserfs for FreeBSD. It's just UFS with linear lists
as directories.

> Note, though, the syscall time there.  We're talking about
> massive amounts of work being done by your kernel, servicing about
> 17 times as many read(2)s as the mbox.  Also worth noting is the time
> spent idling while waiting on the disk.  The mbox waits for only about
> 9 seconds, while maildir winds up waiting for almost 24!  Clearly, you
> wouldn't want several users banging away on maildirs at the same time
> on your system. . .

Don't mistake me for one of the 'maildir is the best' proponents. I'm using
maildirs for my incoming folders (*) and mbox for most archive
boxes (**).

(*) Here maildirs are IMHO more reliable, and they are -- even without
the header cache -- clearly faster, not necessarily while opening
but for marking/deleting/... mails.

(**) I didn't use them for archive boxes with relatively big mails, but
for the other boxes mboxes are faster than maildirs without a header
cache, and they need less disk space, again even compared to maildirs
without a header cache. And very old boxes can easily be compressed.


> > /usr/bin/time -h -l mutt -F /dev/null -f small.maildir -e 'push x'
> >         36.66s real             8.61s user              6.52s sys
> >      68253  block input operations
> >          0  block output operations
> 
> Now, isn't that interesting?  1GB of RAM was barely able to cache anything
> between your opens.  (Maybe a cron job or something happened between the
> two opens?  I wasn't expecting the second open to be _that_ bad. . .)
> Assuming it's not a fluke, somebody on your FS team needs to be blamed:

It's no fluke.

> it looks to me like the only thing that was still cached was the directory
> index itself (and maybe the permission structs in the inode tables).
> With a whole gigabyte of RAM, I think you have the right to expect better.
> The system was waiting for a whopping 21 seconds of disk time, to access
> data that could've easily been cached if Linux only cared enough to :-(

It's FreeBSD, not Linux, but you're right, it could have been better.
Are there any FreeBSD FS gurus reading this? :)
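
For reference, the idle times mentioned in the quotes (about 9 seconds for the mbox, almost 24 and then 21 seconds for the maildir) are what is left of the real time after subtracting user and sys. A small sketch of that arithmetic follows. It assumes the remainder is mostly time spent waiting on the disk, which time(1) does not report directly; wait_seconds is a name made up for this example.

```python
def wait_seconds(real, user, sys_time):
    """Wall-clock time not accounted for by user or system CPU time."""
    return real - user - sys_time

# (real, user, sys) in seconds, copied from the time(1) output quoted in this thread.
runs = {
    "small.mbox, first open": (25.82, 14.87, 1.81),
    "small.maildir, first open": (39.63, 8.73, 7.21),
    "small.maildir, second open": (36.66, 8.61, 6.52),
}

for name, (real, user, sys_time) in runs.items():
    remainder = wait_seconds(real, user, sys_time)
    print(f"{name}: {remainder:.2f}s not spent on the CPU")
```

The three remainders come out at roughly 9.1, 23.7 and 21.5 seconds, matching the figures in the quoted text.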

> > /usr/bin/time -h -l mutt -F /dev/null -f small.maildir -e 'set maildir_cache=cache' -e 'unset maildir_cache_verify' -e 'push x'
> >         3.87s real              2.35s user              0.84s sys
> >          1  block input operations
> >         12  block output operations
> 
> I'm not sure what's happening there, exactly.

'unset maildir_cache_verify' should leave out one stat per message,
so it's a bit faster -- as expected.
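
To illustrate where that one stat per message goes, here is a toy sketch of a header cache with an optional verification step. It is not mutt's implementation: load_headers, the in-memory cache layout (path -> (mtime, headers)) and the verify flag are assumptions made for this example only.

```python
import os
import tempfile

def load_headers(maildir_files, cache, verify):
    """Return (headers_by_path, stat_calls) for a list of message file paths.

    cache maps path -> (mtime, headers). This is a hypothetical layout for
    illustration, not mutt's on-disk header cache format.
    """
    stat_calls = 0
    result = {}
    for path in maildir_files:
        entry = cache.get(path)
        if entry is not None and verify:
            # Verification costs one stat per cached message.
            stat_calls += 1
            if os.stat(path).st_mtime != entry[0]:
                entry = None  # stale entry, re-read the file below
        if entry is None:
            stat_calls += 1
            mtime = os.stat(path).st_mtime
            with open(path, "r", encoding="utf-8", errors="replace") as fh:
                headers = fh.read().split("\n\n", 1)[0]
            entry = (mtime, headers)
            cache[path] = entry
        result[path] = entry[1]
    return result, stat_calls

# Build three throwaway message files and open the "folder" three times.
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for i in range(3):
        path = os.path.join(tmp, f"msg{i}")
        with open(path, "w", encoding="utf-8") as fh:
            fh.write(f"Subject: message {i}\n\nbody\n")
        paths.append(path)

    cache = {}
    _, cold = load_headers(paths, cache, verify=True)         # fills the cache
    _, verified = load_headers(paths, cache, verify=True)     # stats every file
    _, unverified = load_headers(paths, cache, verify=False)  # trusts the cache

print(cold, verified, unverified)
```

With verification off, the warm open touches no message file at all, which is the effect described above. The price is that a changed file would go unnoticed.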


> > BTW:
> > 136M    cache
> 
> When was that number obtained?  I'd expect buffers+cache to be about
> 300MB after the first mbox read.

Ah, this was misleading. The 136MB is the size of the cache file after
reading small.maildir with the headercache. With a page size of 16k, instead
of the default 2k, mutt is a bit faster and the file is only 52MB.
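
As a rough per-message view of those two sizes, here is a short calculation. It assumes that small.maildir holds the same 68244 messages mentioned at the top of this thread and that "MB" means 2**20 bytes; both are assumptions, not measured values.

```python
MESSAGES = 68244   # assumption: same message count as the mailbox described earlier
MIB = 2 ** 20      # assumption: "MB" in the sizes above means 2**20 bytes

def bytes_per_message(cache_megabytes, messages=MESSAGES):
    """Average number of cache-file bytes per message."""
    return cache_megabytes * MIB / messages

per_msg_2k_pages = bytes_per_message(136)   # cache file with the default 2k page size
per_msg_16k_pages = bytes_per_message(52)   # cache file with a 16k page size

print(f"2k pages:  {per_msg_2k_pages:.0f} bytes per message")
print(f"16k pages: {per_msg_16k_pages:.0f} bytes per message")
```

Under these assumptions the default page size costs roughly 2 KB of cache file per message, and the 16k setting roughly 0.8 KB.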


> > And now a third folder, containing 18 emails with a total size of 310MB.
> 
> Our prediction here is obviously that maildir will trounce mbox, even
> without the headercache.
> 
> > /usr/bin/time -h -l mutt -F /dev/null -f monster.mbox -e 'push x'
> >         0.40s real              0.01s user              0.01s sys
> >         47  block input operations
> >          0  block output operations
> 
> There's no way on Earth that your system can read 310MB off the disk in
> less than 0.4 seconds!

Correct.

Nicolas