
Re: color index foo foo ~h pattern causes many fileops on rsync only



[executive summary: a color index ~h rule does not cause any stats in the
Maildir/.snd/ directory when I _open_ the folder (apparently, the header cache
is used for that). However, when mutt resyncs the already opened folder, each
of my sent messages is stat'ed and opened four times per ~h line that is
processed]

On Wed, Mar 28, 2007 at 09:56:55AM +0200, Thomas Roessler wrote:
> The function is called mutt_addr_is_user, btw, and the debug message
> just got fixed.
> 
> The only thing that I can think of that would apply
> mail_addr_is_user that often would be the scoring code, depending on
> what kind of rules you've got in your config there.

Thanks for the hint, that was close enough to let me find the problem.

I had a bunch of rules like these:
color index     brightgreen     black           '! ~p'
color index     black           brightblue      '~C afscv'
color index     brightwhite     black           '~p'

Those rules are ok, mutt processes them quickly enough (a few seconds).


In hindsight it's a bit obvious, but the rules that caused the dramatic
slowdown are lines like these (I had 12):
color index    black           brightyellow    '~h List-Id:.*bugtraq'
color index    brightblue      black           '~h hits\=3'
color index    brightblue      brightyellow    '~h post\ from'

mutt can process ~f, ~C and ~p from the header cache, but not ~h. Or at
least it can't do it on resync apparently (unless the colors are cached
in the header cache file, which makes for a fast open, but a slow resync?)

There is still a clear problem. Just adding a single:
> color index    black           brightyellow    '~h List-Id:.*bugtraq'

causes 4 stats and opens for _each_ message:
> open("/home/merlin/Maildir//.snd/cur/1162956827.15796_57.magic:2,S", O_RDONLY|O_LARGEFILE) = 3
> fstat64(3, {st_mode=S_IFREG|0600, st_size=1914, ...}) = 0
> open("/home/merlin/Maildir//.snd/cur/1162956827.15796_57.magic:2,S", O_RDONLY|O_LARGEFILE) = 3
> fstat64(3, {st_mode=S_IFREG|0600, st_size=1914, ...}) = 0
> open("/home/merlin/Maildir//.snd/cur/1162956827.15796_57.magic:2,S", O_RDONLY|O_LARGEFILE) = 3
> fstat64(3, {st_mode=S_IFREG|0600, st_size=1914, ...}) = 0
> open("/home/merlin/Maildir//.snd/cur/1162956827.15796_57.magic:2,S", O_RDONLY|O_LARGEFILE) = 3
> fstat64(3, {st_mode=S_IFREG|0600, st_size=1914, ...}) = 0
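
For anyone who wants to check the per-message count on their own setup, here
is a minimal sketch that tallies open() calls per file from strace output. It
assumes one syscall per line, without the "> " quoting or any pid prefix; the
name count_opens is just something I made up for illustration.

```python
import re
from collections import Counter

# Matches the path argument of an open() call at the start of a strace line.
OPEN_RE = re.compile(r'^open\("([^"]+)"')

def count_opens(strace_lines):
    """Return a Counter mapping each opened path to its number of open() calls."""
    counts = Counter()
    for line in strace_lines:
        m = OPEN_RE.match(line.strip())
        if m:
            counts[m.group(1)] += 1
    return counts

# The open/fstat64 pair from the trace above, repeated four times.
sample = [
    'open("/home/merlin/Maildir//.snd/cur/1162956827.15796_57.magic:2,S", '
    'O_RDONLY|O_LARGEFILE) = 3',
    'fstat64(3, {st_mode=S_IFREG|0600, st_size=1914, ...}) = 0',
] * 4

for path, n in count_opens(sample).items():
    print(n, path)
```

Any file that shows up with a count above 1 during a single resync is being
reopened.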

If I add a second ~h line, I get 8 fstat and open calls _per_ message.
That explains why I was seeing 5-minute resync times.
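
To put a rough number on it, here is a back-of-the-envelope sketch. It assumes
the cost keeps scaling linearly (4 open and 4 fstat calls per message per ~h
line, as in the strace output above); the folder size of 10,000 messages is a
made-up figure for illustration, not a measurement.

```python
# Observed in the strace output: 4 open()/fstat64() pairs per message per ~h rule.
OPENS_PER_MESSAGE_PER_RULE = 4

def resync_file_ops(messages, h_rules, per_rule=OPENS_PER_MESSAGE_PER_RULE):
    """Estimate open() and fstat() calls for one resync, assuming linear scaling."""
    opens = messages * h_rules * per_rule
    # Each open() in the trace is followed by exactly one fstat64().
    return {"open": opens, "fstat": opens}

# Hypothetical folder of 10,000 messages with the 12 ~h rules mentioned above.
print(resync_file_ops(messages=10_000, h_rules=12))
# → {'open': 480000, 'fstat': 480000}
```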

The unanswered questions are:
1) Why can mutt use the header cache to run the ~h rules when the folder
   is opened, but not when it's resynced?

2) Why 4 fstat and open calls per message in the folder? That's obviously at
   least 3 too many, and most likely 4 too many.

3) Why does adding a message to a folder cause mutt to ignore its entire
   cache and rebuild a brand new one at resync time, scanning all messages
   multiple times, when at folder open time, mutt looks smart enough to use
   the cache for already cached messages, and only open/index/scan whatever
   few messages that weren't in the cache already?

Ok, we made some headway. Is that enough to get the code fixed? :)

Thanks,
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems & security ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/