mutt/2870: color index foo foo ~h pattern causes many fileops on resync only
>Number: 2870
>Notify-List:
>Category: mutt
>Synopsis: color index foo foo ~h pattern causes many fileops on resync only
>Confidential: no
>Severity: normal
>Priority: medium
>Responsible: mutt-dev
>State: open
>Keywords:
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Thu Mar 29 17:43:53 +0200 2007
>Originator: Marc MERLIN
>Release:
>Organization:
>Environment:
Mutt 1.5.13 (2006-08-11)
Copyright (C) 1996-2006 Michael R. Elkins and others.
Mutt comes with ABSOLUTELY NO WARRANTY; for details type `mutt -vv'.
Mutt is free software, and you are welcome to redistribute it
under certain conditions; type `mutt -vv' for details.
System: Linux 2.6.14y (i686) [using ncurses 5.5] [using libidn 0.5.18 (compiled with 0.6.5)]
Compile options:
-DOMAIN
+DEBUG
-HOMESPOOL +USE_SETGID +USE_DOTLOCK +DL_STANDALONE
+USE_FCNTL -USE_FLOCK +USE_INODESORT
+USE_POP +USE_IMAP -USE_GSS -USE_SSL_OPENSSL +USE_SSL_GNUTLS +USE_SASL
+HAVE_GETADDRINFO
+HAVE_REGCOMP -USE_GNU_REGEX
+HAVE_COLOR +HAVE_START_COLOR +HAVE_TYPEAHEAD +HAVE_BKGDSET
+HAVE_CURS_SET +HAVE_META +HAVE_RESIZETERM
+CRYPT_BACKEND_CLASSIC_PGP +CRYPT_BACKEND_CLASSIC_SMIME -CRYPT_BACKEND_GPGME
-BUFFY_SIZE -EXACT_ADDRESS -SUN_ATTACHMENT
+ENABLE_NLS -LOCALES_HACK +COMPRESSED +HAVE_WC_FUNCS +HAVE_LANGINFO_CODESET
+HAVE_LANGINFO_YESEXPR
+HAVE_ICONV -ICONV_NONTRANS +HAVE_LIBIDN +HAVE_GETSID +USE_HCACHE
-ISPELL
SENDMAIL="/usr/sbin/sendmail"
MAILPATH="/var/mail"
PKGDATADIR="/usr/share/mutt"
SYSCONFDIR="/etc"
EXECSHELL="/bin/sh"
MIXMASTER="mixmaster"
To contact the developers, please mail to <mutt-dev@xxxxxxxx>.
To report a bug, please visit http://bugs.mutt.org/.
patch-1.5.11.rr.compressed.1
patch-1.5.4.vk.pgp_verbose_mime
patch-1.5.5.1.nt.xtitles.3.ab.1
patch-1.5.6.dw.maildir-mtime.1
patch-1.5.6.tt.assumed_charset.1
>Description:
Executive summary:
I am using header_cache="~/Maildir/" and set maildir_header_cache_verify = no
Each "color index ~h" rule does not cause any stats in the Maildir/.snd/
directory when I _open_ the folder (apparently, the header cache is used
for that).
However, each of my sent messages is stat'ed and opened four times when
mutt resyncs the already opened folder, that is, 4 times per ~h line that
is processed from my muttrc.
Details:
I had a bunch of rules like these:
color index brightgreen black '! ~p'
color index black brightblue '~C afscv'
color index brightwhite black '~p'
Those rules are ok, mutt processes them quickly enough (a few seconds)
whether I open a folder, or mutt resyncs it.
However, rules like these are causing significant delays on auto resyncs
(i.e. when an open folder is externally modified). 12 rules like the
ones below caused a resync time of 5 minutes for a 40,000-message folder:
color index black brightyellow '~h List-Id:.*bugtraq'
color index brightblue black '~h hits\=3'
color index brightblue brightyellow '~h post\ from'
mutt can process ~f, ~C and ~p from the header cache, but not ~h. Or at
least it can't do it on resync apparently (unless the colors are cached
in the header cache file, which makes for a fast open, but a slow resync?)
There is still a clear problem. Just adding a single:
> color index black brightyellow '~h List-Id:.*bugtraq'
causes 4 stats and opens for _each_ message (again, on resync only)
> open("/home/merlin/Maildir//.snd/cur/1162956827.15796_57.magic:2,S",
> O_RDONLY|O_LARGEFILE) = 3
> fstat64(3, {st_mode=S_IFREG|0600, st_size=1914, ...}) = 0
> open("/home/merlin/Maildir//.snd/cur/1162956827.15796_57.magic:2,S",
> O_RDONLY|O_LARGEFILE) = 3
> fstat64(3, {st_mode=S_IFREG|0600, st_size=1914, ...}) = 0
> open("/home/merlin/Maildir//.snd/cur/1162956827.15796_57.magic:2,S",
> O_RDONLY|O_LARGEFILE) = 3
> fstat64(3, {st_mode=S_IFREG|0600, st_size=1914, ...}) = 0
> open("/home/merlin/Maildir//.snd/cur/1162956827.15796_57.magic:2,S",
> O_RDONLY|O_LARGEFILE) = 3
> fstat64(3, {st_mode=S_IFREG|0600, st_size=1914, ...}) = 0
If I add a second ~h line, I get 8 fstat and open calls _per_ message.
Now, that explains why I was seeing 5-minute resync times.
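
For a rough sense of scale, the figures reported above can simply be multiplied out. This is only a back-of-the-envelope sketch: it assumes every one of the 12 ~h rules costs the same 4 open/fstat pairs per message, and the variable names are mine.

```python
# Back-of-the-envelope estimate using only the figures reported above.
messages = 40_000        # size of the .snd folder
h_rules = 12             # number of "color index ... ~h ..." lines in muttrc
opens_per_rule = 4       # open()+fstat64() pairs seen per message per ~h rule
resync_seconds = 5 * 60  # observed resync time, roughly 5 minutes

total_opens = messages * h_rules * opens_per_rule
opens_per_second = total_opens / resync_seconds

print(total_opens)        # 1920000
print(opens_per_second)   # 6400.0
```

So under those assumptions a single resync would issue close to two million open calls, which is consistent with a delay measured in minutes.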
The unanswered questions are:
1) Why can mutt use the header cache to run the ~h rules when the folder
is opened, but not when it's resynced?
2) Why does adding a message to a folder cause mutt to ignore its entire
cache and rebuild a brand new one at resync time, scanning all messages
multiple times, when at folder open time, mutt looks smart enough to use
the cache for already cached messages, and only open/index/scan whatever
few messages that weren't in the cache already?
3) Why 4 fstat and open calls per message in the folder? That's obviously at
least 3 too many, and most likely 4 too many.
>How-To-Repeat:
Take a 40,000-message ~/Maildir/.snd folder, open it from mutt, send a random
email that gets fcc'ed to that same folder, and see how long mutt takes to
reopen the .snd folder.
Repeat the operation after adding 10 "color index red black ~h fooheader"
rules to muttrc.
Verify fileops with strace -e trace=file mutt
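
To summarize a saved trace, the open calls can be counted per file. The helper below is a minimal sketch, not part of mutt or strace: count_opens and its regular expression are my own illustration, and it assumes each open call starts on its own line as in the excerpt in the description. The sample lines reuse one entry from that excerpt.

```python
import re
from collections import Counter

# Matches open("path", ...) and openat(AT_FDCWD, "path", ...) as printed by strace.
OPEN_RE = re.compile(r'\bopen(?:at)?\((?:AT_FDCWD, )?"([^"]+)"')

def count_opens(trace_lines):
    """Return a Counter mapping each opened path to its number of open calls."""
    counts = Counter()
    for line in trace_lines:
        match = OPEN_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# Sample input: the same open/fstat64 pair repeated 4 times, as in the report.
sample = [
    'open("/home/merlin/Maildir//.snd/cur/1162956827.15796_57.magic:2,S",',
    ' O_RDONLY|O_LARGEFILE) = 3',
    'fstat64(3, {st_mode=S_IFREG|0600, st_size=1914, ...}) = 0',
] * 4

counts = count_opens(sample)
for path, n in counts.most_common(10):
    print(n, path)
```

On a real trace, pass the lines of the saved file instead of the sample list. Any message file that shows a count above 1 per resync is being reopened needlessly.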
>Fix:
My current workaround has been to remove all ~h color rules.
Fixes would be:
1) mutt should never open and stat the same file 4 times in a row.
2) mutt's resync code looks very suboptimal compared to folder open code.
If it were to only parse new messages in the folder (messages not in
the cache), it would effectively fix this problem.
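
The two suggested fixes can be sketched together as follows. This is not mutt's code, only an illustration of the idea. It assumes the cache holds the full header text of each message, which may not be true of mutt's header cache. The names resync and read_headers and the sample messages are hypothetical, and Python's re module stands in for mutt's ~h matching.

```python
import re

def resync(cache, names_on_disk, patterns, read_headers):
    """Bring `cache` (name -> header text) in line with the folder listing.

    Only names missing from the cache are read, once each; every pattern
    is then matched against the in-memory header text.
    Returns (matches, files_read).
    """
    on_disk = set(names_on_disk)
    # Drop cache entries for messages that disappeared from the folder.
    for name in list(cache):
        if name not in on_disk:
            del cache[name]
    files_read = 0
    for name in names_on_disk:
        if name not in cache:
            cache[name] = read_headers(name)  # the single read per new message
            files_read += 1
    compiled = [re.compile(p) for p in patterns]
    matches = {
        name: [rx.pattern for rx in compiled if rx.search(cache[name])]
        for name in names_on_disk
    }
    return matches, files_read

# Made-up sample folder: msg1 was cached at folder open, msg2 just arrived.
folder = {
    "msg1": "From: a@example.org\nList-Id: <bugtraq.example.org>\n",
    "msg2": "From: b@example.org\nSubject: hello\n",
}
cache = {"msg1": folder["msg1"]}
patterns = [r"List-Id:.*bugtraq", r"hits=3"]

matches, files_read = resync(cache, list(folder), patterns, folder.__getitem__)
print(files_read)        # 1
print(matches["msg1"])   # ['List-Id:.*bugtraq']
```

With this shape, the number of file reads on resync depends only on how many messages are new. It no longer grows with the folder size multiplied by the number of ~h rules.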
>Add-To-Audit-Trail:
>Unformatted: