I'm once again full-text-indexing all my email, and it's working great. This has only limited relation to mutt; I use mutt, but almost any MUA could be plugged in in its place. In the rest of this message I'll describe what I'm doing and how.

Last time I did full-text indexing, I used glimpse, but that codebase has gone down a road I didn't want to follow (restricted-use commercial code). For a long time I did without. Recently I took another look at the available full-text indexers; Freshmeat gave me Lupy, glimpse, harvest, holmes, namazu, swish, swish++, and yase. The first one I looked at closely was swish++, and I stopped there. Perhaps some of the others would have worked better; I didn't check.

Given swish++, the whole job was almost perfectly trivial. I find that on my platform, the memory scaling of index is somewhat different from the author's: on Red Hat 8, with -W100000, memory use climbed to 76MB before it finished my email archives, where the author was seeing 64MB per 250K words. Aside from that, everything is slick.

I archive my email in Maildirs; that's critical to this strategy, since every message is then a separate file that can be indexed and linked on its own. Building the initial index didn't take all that long, and I re-index incrementally every so often, which is _really_ quick. I re-index with:

#!/bin/sh -e
cd $HOME/archive/Mail
find */??? -type f -newer swish++.index | index -W100000 -I -
mv swish++.index.new swish++.index

With a current index, I can do keyword searches for email with the attached perl script. Invoked with keywords (actually, search takes boolean relations of keywords), it builds a tmp maildir populated with links to the matching messages and invokes mutt on it. Very, very fast.

-Bennett
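The re-index script above depends on find's -newer test to pick up only the messages that arrived since the index was last written. Here is a minimal sketch of that selection step, runnable without swish++. The helper name list_new_files and the scratch paths are made up for illustration, and the stamp file merely stands in for swish++.index.

```shell
#!/bin/sh
# list_new_files DIR STAMP: print regular files under DIR whose modification
# time is later than STAMP's. Hypothetical helper, not part of swish++.
list_new_files() {
    find "$1" -type f -newer "$2" | sort
}

# Demo in a scratch directory (assumes mktemp -d is available).
demo=$(mktemp -d)
mkdir -p "$demo/inbox/cur"
echo old > "$demo/inbox/cur/msg1"
touch -t 200301010000 "$demo/inbox/cur/msg1"   # backdate: older than the stamp
touch -t 200302010000 "$demo/stamp"            # stand-in for swish++.index
echo new > "$demo/inbox/cur/msg2"              # mtime is "now": newer than the stamp

# Only msg2 is listed; msg1 predates the stamp.
list_new_files "$demo/inbox" "$demo/stamp"
rm -rf "$demo"
```

In the real script, the list printed here is what gets piped to index on standard input.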
#!/usr/bin/perl -w
use strict;
use IO::File;
use File::Basename;
my $nothing = <<'EoF';
Lucy Locket lost her pocket; Kitty Fisher found it.
Nothing in it, nothing in it, but the binding 'round it.
EoF
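# Scratch maildir, unique per process; removed again when the script exits.
my $tmpbox = $ENV{HOME} . '/.mailsearch' . $$;
# Use system rather than exec so the script's own exit status survives cleanup.
END { local $?; system "rm", "-rf", $tmpbox; }
mkdir $tmpbox, 0700 or die "mkdir $tmpbox: $!";
mkdir "$tmpbox/$_" or die "mkdir $tmpbox/$_: $!" for qw(tmp new cur);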
my $cur = "$tmpbox/cur";
chdir $ENV{HOME} . '/archive/Mail' or die;
my $gotsome = 0;
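# Run search on the query. The command line goes through the shell, so quote
# any parentheses or other shell metacharacters in the query.
my $fi = IO::File->new("search @ARGV|") or die "search: $!";
while (defined($_ = $fi->getline)) {
    next if /^#/;           # lines starting with '#' are not result lines
    my $fn = (split)[1];    # second field is the path of the matching message
    # Hard-link the message into the scratch maildir (needs the same filesystem).
    link $fn, "$cur/@{[basename($fn)]}" or die "link $fn: $!";
    $gotsome = 1;
}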
die $nothing unless $gotsome;
system "mutt", "-f", $tmpbox;
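
For readers who would rather stay in shell, the core loop of the script can be sketched roughly as follows. It assumes, as the Perl code does, that the second whitespace-separated field of each result line is the message path, and that paths contain no whitespace. link_results is a hypothetical name, and the sample input in the demo is made up for illustration, not captured search output.

```shell
#!/bin/sh
# link_results DIR: read search-style result lines on stdin and hard-link
# each named file into DIR/cur. Returns non-zero if nothing was linked.
# Hypothetical helper; assumes paths without whitespace.
link_results() {
    dest=$1
    count=0
    mkdir -p "$dest/tmp" "$dest/new" "$dest/cur" || return 1
    while read -r rank path rest; do
        case $rank in
            '#'*|'') continue ;;    # skip '#' lines and blank lines
        esac
        ln "$path" "$dest/cur/$(basename "$path")" || return 1
        count=$((count + 1))
    done
    [ "$count" -gt 0 ]
}

# Demo with one fake message and one fake result line.
demo=$(mktemp -d)
mkdir -p "$demo/archive/list/cur"
echo hello > "$demo/archive/list/cur/msg1"
(
    cd "$demo" &&
    printf '# results: 1\n98 archive/list/cur/msg1 6 x\n' | link_results box &&
    ls box/cur    # lists msg1
)
rm -rf "$demo"
```

A caller would point mutt -f at the resulting directory, as the Perl script does, and remove it afterwards.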