
Re: mutt - slow mbox'es



> Hi,

Hey, Thomas :-)

On Thu, Jul 22, 2004 at 11:20:14AM EDT, Thomas Glanzmann wrote:

> > Why do you need md5 on the entire message?  Doesn't a header-only md5
> > suffice?
> 
> Because the length of the body is also stored in the header. So if you
> modify the body, this length wouldn't be updated and mutt would keep
> crashing.

Hmm ... okay :-(
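
To make the point concrete, here is a minimal sketch in Python (not Mutt's
actual code; the function names and sample messages are made up for
illustration). It assumes a message is just headers, a blank line, and a
body. A header-only digest cannot notice a body edit, so a cached length
taken from the header would go stale without the cache being invalidated:

```python
import hashlib


def split_message(raw: bytes):
    """Split a raw message into (headers, body) at the first blank line."""
    head, _, body = raw.partition(b"\n\n")
    return head, body


def header_digest(raw: bytes) -> str:
    """MD5 over the header block only (hypothetical cache key)."""
    return hashlib.md5(split_message(raw)[0]).hexdigest()


def full_digest(raw: bytes) -> str:
    """MD5 over the entire message, headers and body."""
    return hashlib.md5(raw).hexdigest()


# The header claims a 6-byte body; that is true only for the original.
original = b"From: a@example.org\nContent-Length: 6\n\nhello\n"
modified = b"From: a@example.org\nContent-Length: 6\n\nhello, world\n"

# Same headers, different body: only the full digest sees the change.
headers_match = header_digest(original) == header_digest(modified)
bodies_differ = full_digest(original) != full_digest(modified)
```

Here `headers_match` and `bodies_differ` both come out true, which is the
case the whole-message md5 protects against.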

> > With maildir, you start out far worse with nothing, so you have far
> > more to gain by caching headers.  (Try a 300MB maildir with no
> > headercache, and I think you'd sooner die than wait for it to load ...
> > hehe. . .)

> I already did that, and decided to write a header cache before I die.

LOL :-)

> In that case it's not the size that counts, but the number of messages.

That's not strictly true in general, just in most native UNIX filesystems.

> The performance aspect isn't that clear to people who have a fast PC with
> fast disk IO.

Perhaps not surprisingly, the biggest performance improvement comes from
fast disk bandwidth, and the second-biggest from buffers in RAM (i.e.,
from a freakin' gigabyte of RAM!).

> My PC at home opens an uncached maildir and an mbox
> both in ~3 seconds.

How many messages are we talking about, and what size box?  Remember:
mbox varies mostly by total size, while maildir varies mostly by number
of messages.  (Look at some extreme cases: a box with 5 2GB messages will
be _much_ faster with maildir, while a box with 2 billion 5-byte messages
(which aren't really possible, but whatever) would take far less time with
an mbox, and wouldn't even be possible for a maildir on many filesystems.)
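
A rough way to see the two scaling behaviours, using Python's standard
`mailbox` module. This is only a sketch under simple assumptions: the
helper names are invented here, the "cost" of an mbox scan is taken to be
the bytes in the file (every "From " separator has to be found), and the
"cost" of a maildir scan is taken to be the number of files to open.

```python
import mailbox
import os
import tempfile


def build_boxes(root, count, body_size):
    """Write the same messages into an mbox and a maildir under root."""
    mbox_path = os.path.join(root, "box.mbox")
    maildir_path = os.path.join(root, "box.maildir")
    mbox = mailbox.mbox(mbox_path)
    mdir = mailbox.Maildir(maildir_path, create=True)
    for i in range(count):
        text = (f"From: a@example.org\nSubject: msg {i}\n\n"
                + "x" * body_size + "\n")
        mbox.add(text)
        mdir.add(text)
    mbox.close()  # flushes pending messages to disk
    mdir.close()
    return mbox_path, maildir_path


def mbox_bytes_to_scan(path):
    """An mbox is one file, so indexing it means reading all of it."""
    return os.path.getsize(path)


def maildir_files_to_open(path):
    """A maildir has one file per message, in new/ and cur/."""
    return sum(len(os.listdir(os.path.join(path, sub)))
               for sub in ("new", "cur"))


def demo(count, body_size):
    """Return (mbox bytes, maildir file count) for a throwaway mailbox."""
    with tempfile.TemporaryDirectory() as root:
        mbox_path, maildir_path = build_boxes(root, count, body_size)
        return (mbox_bytes_to_scan(mbox_path),
                maildir_files_to_open(maildir_path))
```

Calling `demo(5, 1000000)` and `demo(5000, 10)` should show the first
number tracking total size and the second tracking message count, which
is the trade-off described above.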

> However, where I have to read my email at the moment, on a
> SunFire280R, the maildir header cache is a big improvement. I am
> talking about 25 seconds vs. 4 seconds in worst-case scenarios for
> opening 30,000 messages in a 90 MByte mailbox/maildir.

It shouldn't take 25 seconds to open a 90MB mbox.

> > If you're serious about scalable performance, I'd suggest a real database
> > server (like mysql) for the mail store.  Trying to emulate a database
> > with the filesystem isn't really The Right Way (TM).
> 
> The day I start using mysql to store my eMail is the day I shoot myself.

I don't see what you have against it.  It's a specialized tool for storing
data and indexing it rapidly.  It seems like the perfect "format" for
a mailbox, if you ask me.
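
As a sketch of what that could look like: the example below uses Python's
built-in `sqlite3` instead of a mysql server, purely so it is
self-contained. The table layout, column names, and helper functions are
assumptions made up for illustration, not an existing mail-store schema.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a hypothetical mail store with indexed headers."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        " id INTEGER PRIMARY KEY,"
        " sender TEXT, subject TEXT, date TEXT,"
        " raw BLOB)"
    )
    # The index is what makes header lookups cheap without opening bodies.
    db.execute(
        "CREATE INDEX IF NOT EXISTS idx_messages_sender"
        " ON messages(sender)"
    )
    return db


def add_message(db, sender, subject, date, raw):
    """Store one message; headers go in columns, the full text in raw."""
    db.execute(
        "INSERT INTO messages (sender, subject, date, raw)"
        " VALUES (?, ?, ?, ?)",
        (sender, subject, date, raw),
    )
    db.commit()


def list_headers(db, sender):
    """Return (id, subject, date) rows for one sender, oldest first."""
    cur = db.execute(
        "SELECT id, subject, date FROM messages"
        " WHERE sender = ? ORDER BY date",
        (sender,),
    )
    return cur.fetchall()
```

Building an index view then only touches the header columns; the `raw`
blob is fetched when a message is actually opened.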

> > You just got me curious about an interesting possible user-mode
> > filesystem, actually: How about a filesystem server providing a virtual
> > maildir from an mbx back-end?  Network clients can use the virtual maildir
> > over NFS about as safely as a real maildir.  Local clients can use the
> > mbx back-end directly, or the virtual maildir, if it's more convenient
> > (say, for the mairix symlink trick).
> 
> hmmm. Just add nfs-safe locking to mbx?

How exactly do you propose doing that?

Besides, one of the advantages of mbx over mbox is that concurrent access
is often possible without fancy tricks (like what Mutt does) that aren't
guaranteed to work.  In mbx, any given pair/triplet/etc. of operations
is either guaranteed to work (and therefore supported), or unsafe (and
therefore not supported).  As soon as you use traditional locking, you
lose that performance edge.  If you're going to work over NFS, I think
maildir is basically the only way to go.  NFS is strong in exactly the
areas where maildir demands a filesystem to be strong (namely directory
search and large sequential block reads), so your performance penalty
for NFS probably won't be as bad as with mbox.
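
For context on why maildir gets by without locks, here is a simplified
sketch of the usual delivery convention as I understand it: write the
message under a unique name in tmp/, then move it into new/ so readers
never see a half-written file. The `deliver` helper and its naming scheme
are illustrative assumptions, and it uses a plain rename where the
original maildir description uses link-then-unlink.

```python
import os
import socket
import time

_delivery_counter = 0


def make_maildir(path):
    """Create the three standard maildir subdirectories."""
    for sub in ("tmp", "new", "cur"):
        os.makedirs(os.path.join(path, sub), exist_ok=True)


def deliver(maildir, raw: bytes) -> str:
    """Write raw into tmp/, then rename it into new/; return final path."""
    global _delivery_counter
    _delivery_counter += 1
    host = socket.gethostname().replace("/", "_").replace(":", "_")
    # Time, pid, a per-process counter and the host keep names unique.
    unique = f"{int(time.time())}.P{os.getpid()}Q{_delivery_counter}.{host}"
    tmp_path = os.path.join(maildir, "tmp", unique)
    new_path = os.path.join(maildir, "new", unique)
    # "x" mode fails if the name already exists instead of overwriting.
    with open(tmp_path, "xb") as handle:
        handle.write(raw)
        handle.flush()
        os.fsync(handle.fileno())
    # The message only becomes visible in new/ once it is complete.
    os.rename(tmp_path, new_path)
    return new_path
```

No reader ever has to coordinate with the writer here, which is the
property traditional mbox locking tries (and over NFS often fails) to
provide.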

 - Dave

-- 
Uncle Cosmo, why do they call this a word processor?
It's simple, Skyler.  You've seen what food processors do to food, right?

Please visit this link:
http://rotter.net/israel
