
Re: The future of mailboxes?



I thought you wanted to keep these responses off-list. If not, fine.

On Sunday, August 13 at 05:39 PM, quoth Nicolas George:
> By the way, are there technical notes somewhere about what API it is necessary to implement when writing a new storage format for mutt?

My understanding is that mutt is, unfortunately, not modularized like that. The integration with things like IMAP support and mbox/maildir support is fairly complex and not abstracted out.

> It is widely admitted that keeping the same information twice in two different formats is a bad thing, because it leads, sooner or later, to copies that are out of sync, and therefore needs synchronisation tools and so on.

For a "bad thing" it also happens to be extremely popular. Virtually every modern computer has, for example, not one but two levels of cache between CPU and main memory. Some have three, and I believe the Pentium 4 hides a semi-secret fourth level. There are, indeed, rather complex cache coherency algorithms, and it's something that hardware designers fret about a lot, but it's something worth doing in most cases.

What helps is the designation that one is a *cache* and the other is *authoritative*. This is different from what they taught you in database design class, where data duplication was bad because everything is equally authoritative.

Additionally, if you're presuming that one can only ever access the cache with a client that supports it (e.g. the mail client you wish to design), you can easily ensure that the cache is *never* out of sync with the main mail storage.
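
To make that concrete, here is a minimal sketch of the idea. This is not mutt's header cache; the struct and function names are invented. The cache records the size and mtime of the mbox when it was built, and gets thrown away and rebuilt the moment the authoritative file no longer matches:

#include <stdbool.h>
#include <sys/stat.h>
#include <time.h>

struct header_cache {
    off_t  mbox_size;    /* size of the mbox when the cache was built  */
    time_t mbox_mtime;   /* mtime of the mbox when the cache was built */
    /* ... cached header data would live here ... */
};

/* The mbox on disk is authoritative; trust the cache only if the mbox
 * has not changed since the cache was written. */
static bool cache_is_fresh(const char *mbox_path,
                           const struct header_cache *hc)
{
    struct stat st;

    if (stat(mbox_path, &st) != 0)
        return false;    /* can't even stat the mbox: rebuild the cache */

    return st.st_size == hc->mbox_size && st.st_mtime == hc->mbox_mtime;
}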

> The only point where several files may behave faster is if there are a lot of huge attachments and the task is to read only the headers. And that is only true if the mbox file has no reliable Content-Length field.

The mbox format itself does not require a Content-Length field (it is merely a convenience), and thus, for a general-purpose mail reader, you must assume that you NEVER have a reliable Content-Length field: it is an optimization, nothing more.
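
In code, the consequence is just a linear scan (a hypothetical scanner, not mutt's parser; it ignores ">From " quoting and assumes lines fit in the buffer). The only thing you can count on in mbox is the "From " separator at the start of each message; a trustworthy Content-Length would merely let you fseek() past a body instead of reading through it:

#include <stdio.h>
#include <string.h>

/* Count the messages in an mbox by looking for the "From " separator
 * at the start of each line. */
static long count_mbox_messages(const char *path)
{
    char line[8192];
    long count = 0;
    FILE *fp = fopen(path, "r");

    if (!fp)
        return -1;

    while (fgets(line, sizeof line, fp))
        if (strncmp(line, "From ", 5) == 0)
            count++;

    fclose(fp);
    return count;
}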

> Furthermore, if the kernel itself is doing the reading, it has absolutely all the information it needs. In fact, if you read the documentation for, let us say, the Linux sendfile function, "in_fd must correspond to a file which supports mmap()-like operations", which strongly suggests that sendfile is nothing more than mmap+write bundled together.

The implementation is irrelevant, what matters is that the kernel does it all: I say "kernel, send this file" and the kernel goes and does it and tells me when it's done. The kernel could be doing a tight read+write loop for all I care, but the real reason it wants file descriptors that support mmap()-like operations is that it wants to be able to have any and all of the data necessary for sending at once, so that it can optimize its sending strategy, and also so that it can seek backward in the file to handle retransmissions. The kernel, knowing where all of the pieces of that file are, can then simply set up a DMA transfer from the disk to the NIC.
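
For reference, the "kernel, send this file" call looks roughly like this on Linux. The names are placeholders, error handling is trimmed, and real code would loop because sendfile may transfer fewer bytes than requested:

#include <sys/sendfile.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/* Hand an entire file to the kernel for sending over a connected
 * socket, and let the kernel pick the transfer strategy. */
static ssize_t send_whole_file(int sock_fd, const char *path)
{
    struct stat st;
    off_t offset = 0;
    ssize_t sent;
    int in_fd = open(path, O_RDONLY);

    if (in_fd < 0)
        return -1;
    if (fstat(in_fd, &st) < 0) {
        close(in_fd);
        return -1;
    }

    /* in_fd has to be something that supports mmap()-like operations,
     * i.e. a regular file rather than a socket or a pipe. */
    sent = sendfile(sock_fd, in_fd, &offset, (size_t)st.st_size);

    close(in_fd);
    return sent;
}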


>> In theory, yes, but in practice, no, they do not have exactly the same consequences on databases and filesystems: databases tend to avoid writing to disk whenever possible, while filesystems are frequently designed for reliability when (say) the power goes out.

> Do you have reasons to think that database designers are stupid or do not care about reliability?

I've had far too many databases corrupt themselves into oblivion because the power went out. I've had filesystems get upset because the power went out, but nothing I couldn't fix with a good fsck; worst case: I lost a couple files.

If database designers are not stupid and care dearly about reliability, then tell me: why is MyISAM still the default storage engine for MySQL? It supports none of the reliability features of InnoDB, BerkeleyDB, or Gemini, nor does it support transactions. Is that supposed to make me supremely confident that they have data integrity in the face of unexpected failure as their number one priority? No, of course not: that tells me that they have SPEED as their number one priority.

> For tables which are just linear storage of blobs of data, they adopt a very filesystem-like data structure, and achieve the same performance as filesystems.

MySQL stores BLOBs in tables of 2000-byte rows. Pretending for a moment that this is essentially equivalent to a block on disk, where did that number come from? It's not a convenient power of two, and it has no relation to the 4k, 8k, or 16k block sizes of most filesystems, nor to the 4k page size of most memory systems. Maybe if you somehow convinced MySQL to format up its own partition, you could argue that it could make the low-level block size 2000 bytes, but who are we kidding here?

Assuming that the BLOB (let's say, a 24k mail message like the one you just sent) got inserted at the same time as some other BLOB did, those rows may be interleaved (or MySQL may choose to interleave them for some other reason). Reading linearly through them will be slower from MySQL's database than from a filesystem, because you've got to map in twice as many pages from disk. And, of course, it has to go through the filesystem anyway, because the database is just a file on disk. You've got twice as much overhead, in the *best* case, because the database has to figure out where all the parts of the blob are within your file, and the OS has to figure out where all the parts of the file are too.

Given the 2000-byte block size (which doesn't vary depending on the size of the database, like block sizes in filesystems do), I'm not exactly supremely confident that MySQL is going to structure its BLOB storage tables to optimize the jumping around in the file so as to minimize the jumping around in the filesystem it's implemented on top of.
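
Back-of-the-envelope, using the numbers above (2000-byte rows, a 24k message, 4k pages) and assuming, for illustration, the worst case where every other row belongs to some other BLOB:

#include <stdio.h>

int main(void)
{
    const long msg_bytes  = 24 * 1024;  /* the 24k mail message          */
    const long row_bytes  = 2000;       /* the BLOB row size cited above */
    const long page_bytes = 4096;       /* a typical VM/filesystem page  */

    long rows = (msg_bytes + row_bytes - 1) / row_bytes;          /* 13 */

    /* Contiguous file: the message occupies msg_bytes on disk. */
    long file_pages = (msg_bytes + page_bytes - 1) / page_bytes;  /*  6 */

    /* Fully interleaved with another BLOB: the same rows are spread
     * across twice that span of the table file. */
    long span = 2 * rows * row_bytes;
    long db_pages = (span + page_bytes - 1) / page_bytes;         /* 13 */

    printf("plain file: %ld pages, interleaved table: %ld pages\n",
           file_pages, db_pages);
    return 0;
}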

> For tables with more cross-references, they adopt a more complex data structure; but in that case, filesystems just could not have done the job.

Well, good for them. But we're talking about storing mail, which filesystems do very well.

~Kyle
--
Life is too important to be taken seriously.
                                                       -- Oscar Wilde
