
Re: The future of mailboxes?



On Septidi 27 Thermidor, year CCXIV, Derek Martin wrote:
> Agreed, though from the perspective of maximizing performance, the
> latter is definitely the way to go.  One point: if support for this
> were written using good software engineering principles (modular
> code), it could easily be provided as a separate library which other
> mail clients could choose to adopt (or not).  I think they would
> adopt, if people started using it.  I always hated the idea that
> Outlook does this, because it doesn't play nice with others...  but if
> there were a standard way to do it which were well implemented and
> freely available (i.e.  GPL or some such license), that would kick
> ass.  As pointed out already though, Mutt doesn't exactly have a
> modular mailbox driver API, so integrating that into mutt might be
> difficult.  Probably still worth doing right (if done at all), and
> might even pave the way for Mutt's mailbox code to get cleaned up.

I absolutely agree with that.

But it is probably easier to get attention from the mutt team than from,
say, the Thunderbird team. And it is definitely easier to look into the code
and try to do something -- although I am not at that level yet.

> Also, note that the virtual folder aspects of this can be implemented
> using either mbox or maildir plus caching/indexing, though you don't
> get the same performance benefits regarding the other operations.  You
> don't even need to keep all the messages in one folder; you just need
> to keep a cache/index of where the messages are.  Complex?  Yeah, but
> absolutely worth doing, I think, if people want to stay with
> mail-in-files folders.  In addition to making virtual folders
> possible, it will also make things generally faster (because headers
> and such will necessarily be cached and indexed, for all supported
> folder formats).

Again, I absolutely agree.


On Septidi 27 Thermidor, year CCXIV, Seth Arnold wrote:
> While SQLite is some amazing code, it really requires a single process
> to manage the coherency of the system. I would like to continue to use
> e.g. procmail to "deliver" my mail -- marking some messages already
> read before delivering them (duplicates, uninteresting automatically
> generated messages that I occasionally search through but never _read_,
> etc), classifying nonspam, possible spam, definitely spam.
> 
> Because I want both mutt _and_ procmail to be able to modify the mail
> storage, I believe SQLite is the wrong approach; you would necessarily
> wind up building something rather like postgresql's postmaster, or mysqld,
> etc. Both mutt and your mail delivery system (procmail, postfix's 'local'
> agent, exim's local channels, etc) would need to communicate with the
> master mail database process.

That several tools can handle the mail storage is of course an absolute
prerequisite. But I believe you are making a wrong assumption here: that
only one program can touch a SQLite database. SQLite behaves perfectly well
when several programs use the same database file; that was the very first
thing I checked when I tried SQLite. Changes committed by one process are
immediately visible to the others, and transactions block conflicting
operations from other processes.
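
As a rough illustration of that behaviour, here is a minimal sketch using
Python's standard sqlite3 module. Two connections to the same file stand in
for two separate programs (say, a delivery agent and a mail reader). The
file name, the "mail" table and the function name are made up for the
example; nothing here is taken from mutt or procmail. With a short busy
timeout, the second writer waits briefly and then gets a "database is
locked" error instead of corrupting anything.

```python
import os
import sqlite3
import tempfile


def demo_shared_access():
    """Two connections to one file stand in for two independent programs."""
    path = os.path.join(tempfile.mkdtemp(), "mail.db")  # hypothetical file name

    # isolation_level=None puts the connection in autocommit mode, so the
    # transactions below are started and ended explicitly.
    delivery = sqlite3.connect(path, timeout=0.1, isolation_level=None)
    reader = sqlite3.connect(path, timeout=0.1, isolation_level=None)

    delivery.execute(
        "CREATE TABLE mail (id INTEGER PRIMARY KEY, subject TEXT, seen INTEGER)"
    )
    delivery.execute("INSERT INTO mail (subject, seen) VALUES ('hello', 0)")

    # A committed change is visible through the other connection.
    visible = reader.execute("SELECT subject FROM mail").fetchall()

    # While one writer holds the write lock, a second writer is refused
    # once its busy timeout (0.1 s here) expires.
    delivery.execute("BEGIN IMMEDIATE")
    delivery.execute("UPDATE mail SET seen = 1 WHERE id = 1")
    try:
        reader.execute("UPDATE mail SET seen = 0 WHERE id = 1")
        blocked = False
    except sqlite3.OperationalError:  # "database is locked"
        blocked = True
    delivery.execute("COMMIT")

    final = reader.execute("SELECT seen FROM mail WHERE id = 1").fetchone()[0]
    delivery.close()
    reader.close()
    return visible, blocked, final
```

A real delivery agent would use a longer timeout and retry, so that a
conflicting write is delayed rather than lost.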

> I expected that I would have several tables:
<snip>

That is the straightforward design, provided one does not forget that a
mail can have several From and To addresses, which I did.
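
The snipped table list is not reproduced here. Purely to illustrate the
"several From and To addresses" point, the following sketch uses a link
table between messages and addresses. All table, column and function names
are hypothetical, and this is not meant to be the design proposed in the
quoted message.

```python
import sqlite3
from email import message_from_string
from email.utils import getaddresses

# Hypothetical schema: one row per (message, address, role) triple, so a
# message can carry any number of From, To or Cc addresses.
SCHEMA = """
CREATE TABLE message (id INTEGER PRIMARY KEY, subject TEXT);
CREATE TABLE address (id INTEGER PRIMARY KEY, email TEXT UNIQUE);
CREATE TABLE message_address (
    message_id INTEGER REFERENCES message(id),
    address_id INTEGER REFERENCES address(id),
    role       TEXT               -- 'from', 'to' or 'cc'
);
"""


def index_message(db, raw):
    """Parse one raw message and record its subject and all its addresses."""
    msg = message_from_string(raw)
    cur = db.execute(
        "INSERT INTO message (subject) VALUES (?)", (msg.get("Subject", ""),)
    )
    message_id = cur.lastrowid
    for role in ("from", "to", "cc"):
        # get_all() returns every header of that name; getaddresses() splits
        # each header into its individual (display name, address) pairs.
        for _name, email in getaddresses(msg.get_all(role, [])):
            db.execute("INSERT OR IGNORE INTO address (email) VALUES (?)", (email,))
            (address_id,) = db.execute(
                "SELECT id FROM address WHERE email = ?", (email,)
            ).fetchone()
            db.execute(
                "INSERT INTO message_address VALUES (?, ?, ?)",
                (message_id, address_id, role),
            )
    return message_id
```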

The hard part, I think, comes if the mail data itself is wanted in files, for
any of the reasons given in the thread. The database must hold a pointer to
the data, and the format of that pointer depends on the underlying storage.
Furthermore, the database needs some way of detecting that the underlying
storage has changed and must be re-parsed. Finally, rules are needed to
decide which underlying storage a new or modified mail goes into.
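
The first two requirements can be sketched as follows, assuming that a
change of file size or modification time is a good enough signal that a
file must be re-parsed. That is an assumption: a content checksum would be
stricter but slower. The "location" table and both function names are
invented for the example.

```python
import os
import sqlite3

# Hypothetical index table: where each message lives, plus the size and
# mtime of the backing file at the time it was indexed.
SCHEMA = """
CREATE TABLE location (
    message_id INTEGER PRIMARY KEY,
    kind       TEXT,      -- 'maildir' or 'mbox'
    path       TEXT,      -- file holding the message
    offset     INTEGER,   -- byte offset inside an mbox, NULL for maildir
    size       INTEGER,   -- size of the file when it was indexed
    mtime_ns   INTEGER    -- modification time when it was indexed
);
"""


def record(db, message_id, kind, path, offset=None):
    """Store a pointer to the message together with the file's current stat."""
    st = os.stat(path)
    db.execute(
        "INSERT OR REPLACE INTO location VALUES (?, ?, ?, ?, ?, ?)",
        (message_id, kind, path, offset, st.st_size, st.st_mtime_ns),
    )


def stale_paths(db):
    """Return the files that vanished or changed since they were indexed."""
    stale = set()
    rows = db.execute("SELECT DISTINCT path, size, mtime_ns FROM location")
    for path, size, mtime_ns in rows.fetchall():
        try:
            st = os.stat(path)
        except FileNotFoundError:
            stale.add(path)  # e.g. a maildir file renamed by another client
            continue
        if (st.st_size, st.st_mtime_ns) != (size, mtime_ns):
            stale.add(path)  # e.g. new mail appended to an mbox
    return sorted(stale)
```

Every path returned by stale_paths() would then be re-parsed and its rows
replaced. The third requirement, the rules that choose a storage for new or
modified mail, is a policy question and is not covered by this sketch.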

Storing everything in the database is so much simpler :-/


On Sextidi 26 Thermidor, year CCXIV, Kyle Wheeler wrote:
> My understanding is that mutt is, unfortunately, not modularized like 
> that. The integration with things like IMAP support and mbox/maildir 
> support is fairly complex and not abstracted out.

That is also what I understood from looking at the code. Does the mutt team
plan to rework it in a more modular way?

> For a "bad thing" it also happens to be extremely popular.

Sometimes performance constraints make it absolutely necessary to use a more
complex, less reliable solution.

> What helps is the designation that one is a *cache* and the other is 
> *authoritative*.

That helps, but not much. What it makes easier is recovery: if the data is
out of sync, we know which copy is correct. But working with a cache in front
of badly designed authoritative data is always more painful than working with
well-designed authoritative data.

> Additionally, if you're presuming that one can only ever access the 
> cache with a client that supports it (e.g. the mail client you wish to 
> design), you can easily ensure that the cache is *never* out of sync 
> with the main mail storage.

Until the first bug is found.

> The mbox format itself does not require a content-length field (it is 
> merely a convenience), and thus for a general-purpose mail reader, you 
> must assume that you NEVER have a reliable content-length field: it is 
> an optimization, nothing more.

Unless the mail reader is the only program that writes to that mbox, and it
always writes a Content-Length field (and it is bug-free).
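
Both positions can be put side by side in a small sketch. The splitter
below scans for "From " separator lines by default, and uses Content-Length
only when the caller states that the writer is trusted; even then it checks
that the length lands on a message boundary. The function names and the
trust_content_length flag are invented for the example, and this is not how
mutt itself parses mbox files.

```python
def _end_from_content_length(data, start):
    """Return the end offset given by Content-Length, or None if unusable."""
    hdr_end = data.find(b"\n\n", start)
    if hdr_end == -1:
        return None
    length = None
    for line in data[start:hdr_end].split(b"\n"):
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            try:
                length = int(value.strip())
            except ValueError:
                return None
    if length is None or length < 0:
        return None
    end = hdr_end + 2 + length
    # Sanity check: after the body there must be end of file or, past any
    # blank lines, the next "From " separator.
    nxt = end
    while data[nxt:nxt + 1] == b"\n":
        nxt += 1
    if end > len(data) or (nxt < len(data) and not data.startswith(b"From ", nxt)):
        return None
    return end


def split_mbox(data, trust_content_length=False):
    """Split mbox bytes into messages, without their "From " separator lines."""
    messages = []
    pos = 0
    while pos < len(data):
        if not data.startswith(b"From ", pos):
            raise ValueError("expected a 'From ' separator at offset %d" % pos)
        eol = data.find(b"\n", pos)
        if eol == -1:
            break  # truncated separator line at end of file
        start = eol + 1
        end = _end_from_content_length(data, start) if trust_content_length else None
        if end is None:
            # Fallback: the message runs up to the next line starting with "From ".
            nxt = data.find(b"\nFrom ", start)
            end = len(data) if nxt == -1 else nxt
        messages.append(data[start:end])
        pos = end
        while data[pos:pos + 1] == b"\n":  # skip blank lines between messages
            pos += 1
    return messages
```

With a body that contains an unescaped line starting with "From ", the
trusted mode keeps the message whole, while the scanning mode splits it at
that line.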

> The implementation is irrelevant, what matters is that the kernel does 
> it all: I say "kernel, send this file" and the kernel goes and does it 
> and tells me when it's done.

Your faith in the kernel is touching, but it is not really conclusive: the
kernel is nothing more than a big bunch of code, and knowing how it does its
work is absolutely necessary to predict performance accurately.

> Well, good for them. But we're talking about storing mail, which 
> filesystems do very well.

For servers, perhaps. For a mail user agent, that is not true. That is the
whole point of this thread and of numerous others, and of all the dirty hacks
with forests of symlinks.


Regards,

-- 
  Nicolas George
