On Mon, Aug 14, 2006 at 12:20:47AM +0100, Paul Walker wrote:
> On Sun, Aug 13, 2006 at 05:39:14PM +0200, Nicolas George wrote:
>
> > Thus, if part of your mail is in a database, then having all your mail in
> > the database is a sensible and straightforward choice.
>
> This depends on which viewpoint you're coming from - basically, whether the
> stuff in the database is an optimisation which lets you do certain things
> [faster], or the stuff in the database *is* the mail.
>
> From the point of view of interoperating with, well, anything else, the
> first option would seem more sensible.

Agreed, though from the perspective of maximizing performance, the
latter is definitely the way to go.

One point: if support for this were written using good software
engineering principles (modular code), it could easily be provided as
a separate library which other mail clients could choose to adopt (or
not). I think they would adopt, if people started using it. I always
hated the idea that Outlook does this, because it doesn't play nice
with others... but if there were a standard way to do it which were
well implemented and freely available (i.e. GPL or some such license),
that would kick ass.

As pointed out already though, Mutt doesn't exactly have a modular
mailbox driver API, so integrating that into mutt might be difficult.
Probably still worth doing right (if done at all), and might even pave
the way for Mutt's mailbox code to get cleaned up.

Storing mail in files is fine, but a database definitely can speed up
a lot of operations. For example: People always point to maildir
being faster to delete (expunge) a message from a folder than mbox,
which is true... FOR A SINGLE MESSAGE. But the reverse becomes true
if you are deleting a large number of messages -- a fact not often
mentioned by the maildir people. This is because there will be fewer
messages for mbox to rewrite, but all those unlink operations will
require a lot of overhead.
I routinely see the effects of this at work when I return from my
weekend, where I will often have >1000 messages which I don't care
about in a particular folder, and deleting them all using maildir
takes forever. The same operation in an mbox folder is quite fast by
comparison.

But if you're using a database, it's always fast, no matter how many
messages you are deleting. You just remove records from a table,
essentially, and let the database worry about cleaning up the data.

Likewise, maildir is quite slow at reading message indexes for very
large folders (sans caching). Mbox performs much better, but is still
slow, though caching improves the performance of both. With a
database, you'll be putting all of the common headers into their own
fields in some table, which you can index, so displaying the message
index will stay fast (compared to any file-based format) no matter how
many messages you have, as will searching on message headers. This is
a big win for the user. You'll need to use a database backend which
supports regular expression searching though, if you want to maintain
the power of Mutt.

And, as previously pointed out, this makes implementing virtual
folders insanely easy. That's a Good Thing.

> (This is starting from the assumption that sticking this metadata in
> databases is a good plan, which I'm unconvinced of.)

There are performance benefits, but they don't compare to using a
database wholesale. Besides, isn't that what hcache already does?

Also, note that the virtual folder aspects of this can be implemented
using either mbox or maildir plus caching/indexing, though you don't
get the same performance benefits regarding the other operations. You
don't even need to keep all the messages in one folder; you just need
to keep a cache/index of where the messages are. Complex? Yeah, but
absolutely worth doing, I think, if people want to stay with
mail-in-files folders.
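
To make the cache/index idea a bit more concrete, here is a rough
sketch using Python's sqlite3 module. Everything in it is an
assumption for illustration only: the table layout, the column names
and the helper names (open_index, virtual_folder, expunge) are
hypothetical and have nothing to do with Mutt's actual code or with
hcache. It shows common headers stored in indexed columns, a regular
expression search hooked into the backend, a virtual folder as a plain
query across real folders, and a bulk delete as a single statement.

```python
import re
import sqlite3


def _regexp(pattern, value):
    # SQLite rewrites "X REGEXP Y" as a call to regexp(Y, X).
    return int(value is not None and re.search(pattern, value) is not None)


def open_index(path=":memory:"):
    """Open a hypothetical header index; the schema is illustrative only."""
    conn = sqlite3.connect(path)
    # SQLite ships no regexp() by default, so supply one from Python.
    conn.create_function("regexp", 2, _regexp)
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS messages (
            id       INTEGER PRIMARY KEY,
            folder   TEXT NOT NULL,  -- mbox file or maildir directory
            location TEXT NOT NULL,  -- byte offset or file name within it
            sender   TEXT,
            subject  TEXT,
            date     TEXT
        );
        CREATE INDEX IF NOT EXISTS idx_sender ON messages (sender);
        CREATE INDEX IF NOT EXISTS idx_folder ON messages (folder);
    """)
    return conn


def virtual_folder(conn, header, pattern):
    """Return (folder, location) pairs for messages whose header matches."""
    if header not in ("sender", "subject", "date"):
        raise ValueError("unsupported header: " + header)
    sql = (f"SELECT folder, location FROM messages "
           f"WHERE {header} REGEXP ? ORDER BY id")
    return conn.execute(sql, (pattern,)).fetchall()


def expunge(conn, folder, pattern):
    """Delete index rows in `folder` whose subject matches; return the count.

    With mail stored wholesale in the database this would be the entire
    expunge. With an index over mbox/maildir, the files themselves would
    still have to be rewritten or unlinked separately.
    """
    cur = conn.execute(
        "DELETE FROM messages WHERE folder = ? AND subject REGEXP ?",
        (folder, pattern),
    )
    conn.commit()
    return cur.rowcount


if __name__ == "__main__":
    conn = open_index()
    conn.execute(
        "INSERT INTO messages (folder, location, sender, subject, date) "
        "VALUES (?, ?, ?, ?, ?)",
        ("~/Mail/lists", "1155512447.1", "list@example.org",
         "Re: mail in a database", "2006-08-14"),
    )
    print(virtual_folder(conn, "subject", r"database"))
```

Note that a regular expression match like this cannot use the column
indexes, so it still scans the table; the indexes help with exact or
prefix lookups and with listing a single folder.
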
In addition to making virtual folders possible, it will also make
things generally faster (because headers and such will necessarily be
cached and indexed, for all supported folder formats).

-- 
Derek D. Martin    http://www.pizzashack.org/
GPG Key ID: 0xDFBEAD02
-=-=-=-=-
This message is posted from an invalid address.  Replying to it will result in
undeliverable mail.  Sorry for the inconvenience.  Thank the spammers.