Re: fadvise WILLNEED for tokyocabinet header cache



On Sun, Jun 21, 2009 at 11:05:28PM +0200, Rocco Rutte wrote:
> I think making it faster is always good. However, there're some more 

eheh me too ;)

> points that would have to be solved to get the patch ready for 
> inclusion. For example test if posix_fadvise() is available, why only 

Would that require a configure knob?

> do that for tokyocabinet, etc. Note that we're in a feature-freeze to 

Indeed, there's no good reason to do it only for tokyocabinet. But
anyone who wants performance should use a backend with compression, so
that one reads 1/5th of the data, and with fadvise reads it at 3 times
the speed of a read without fadvise. Combined, compression and fadvise
reduce the opening time tenfold, ignoring the time the db lib takes to
process the data after the I/O is complete. I agree we should add
fadvise to all backends, even if it makes little sense to use anything
but tokyocabinet.
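
To make the idea concrete, here is a minimal sketch of a WILLNEED hint
on a cache file, written in Python rather than mutt's C. It is not the
patch itself: the helper name prefetch_file is hypothetical, and
swallowing errors is an assumption based on the hint being advisory
only. The hasattr() test stands in for the availability check
mentioned above.

```python
import os


def prefetch_file(path):
    """Ask the kernel to start reading `path` into the page cache.

    Returns True if the hint was issued, False if posix_fadvise() is
    unavailable on this platform or the kernel rejected the hint.
    """
    # Availability check: os.posix_fadvise only exists on some Unix systems.
    if not hasattr(os, "posix_fadvise"):
        return False
    fd = os.open(path, os.O_RDONLY)
    try:
        # offset=0, len=0 means "from the start to the end of the file".
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_WILLNEED)
    except OSError:
        # The hint is advisory, so a failure here is treated as non-fatal.
        return False
    finally:
        os.close(fd)
    return True
```

A caller would issue the hint just before handing the same path to the
db lib, so that the reads find the data already cached or in flight.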

Next week I can try to prepare a more suitable and complete patch than
the proof-of-concept hack I'm using right now. If somebody wants to
take over this task, you're welcome ;).

> It's there because mutt displays what the mailbox has, not what the 
> cache has. The cur/ directory may change greatly between two mutt 
> sessions (e.g. a sync using unison, rsync, etc). I can't imagine 
> dropping it (except for a replacement mechanism that tells us what files 
> there are).

What if I can guarantee you that nothing modifies the 'cur' directory
behind mutt's back? The only things that touch my maildir are procmail
in 'tmp' and 'new', and mutt in 'cur' and 'new'. Nothing else ever
touches my maildir, so I definitely need a knob to disable readdir on
'cur' and run readdir only on 'new' when opening a maildir. With that,
plus fadvise and tokyocabinet compression, things will fly even for
large folders with 150k mails ;).

If I can remove readdir, opening a 150k folder on cold cache after
boot with fadvise and tokyocabinet (55M hcache file) will be only 2
seconds slower than opening it from totally hot filesystem caches.
After those 2 seconds of I/O inside the fadvise syscall, reading the
hcache file at >30M/sec, all the remaining wait time is spent at 100%
cpu load.

On small folders (<10k emails), with fadvise and compression things
are now almost immediate even on cold cache with readdir kept. The hd
in my laptop is very slow; on a really fast hard disk the hcache will
be loaded at 100M/sec with fadvise, so the difference between opening
a 150k email folder on hot and on cold kernel filesystem caches will
be almost unnoticeable if we have a knob to disable readdir on 'cur'.
(This assumes 'new' doesn't hold lots of mails, but it's fine if
things slow down a bit when you haven't opened the folder for a
while ;)
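
As a quick sanity check of the figures above, a short sketch of the
arithmetic, assuming "55M" means 55 megabytes and the rates are in
megabytes per second. The function name read_seconds is just for
illustration, and the model ignores seeks and everything the db lib
does after the I/O completes.

```python
def read_seconds(size_mb, rate_mb_per_s):
    """Time in seconds to read size_mb megabytes at a steady rate."""
    return size_mb / rate_mb_per_s


HCACHE_MB = 55  # hcache file size quoted above

# 30 MB/s is the slow laptop disk, 100 MB/s the fast disk.
for label, rate in (("slow laptop disk", 30), ("fast disk", 100)):
    print(f"{label}: {read_seconds(HCACHE_MB, rate):.2f} s")
```

That works out to a bit under 2 seconds on the slow disk, in line with
the 2 seconds above, and roughly half a second at 100M/sec.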