On Thu, Oct 23, 2003 at 05:43:03PM -0400, Allister MacLeod <amacleod@xxxxxxxx> wrote:
> Ok.. I did write a small python script that uses a relatively simple
> method to reduce the regex length. I have the feeling that its
> effectiveness could be improved, perhaps by changing to a character by
> character tree or something. Anyway, the problem is harder than
> I first thought, so I'll probably just leave it as it is. :-)

Okay, I thought this was an interesting problem, so I wrote my own brief
perl script (attached). Incidentally, I wasn't able to get the supplied
python script to run. Also, going to a character-by-character split only
reduced the output by an additional 4 characters, so I didn't do it, as I
feel the regex below is more readable. It does shuffle around the order
of things, but that wasn't a concern for me.

Output from the perl script:

Original string(1308):
(archivers/p5-Archive-Tar|archivers/p5-Compress-Zlib|devel/p5-AppConfig|devel/p5-Class-Factory-Util|devel/p5-Class-Singleton|devel/p5-Config-General|devel/p5-Config-Ini|devel/p5-Date-Calc|devel/p5-Date-ICal|devel/p5-Date-ISO|devel/p5-Date-Leapyear|devel/p5-Date-Manip|devel/p5-Date-Pcalc|devel/p5-DateConvert|devel/p5-DateTime|devel/p5-DateTime-Locale|devel/p5-DateTime-TimeZone|devel/p5-ExtUtils-ParseXS|devel/p5-File-Temp|devel/p5-Inline|devel/p5-Inline-CPP|devel/p5-Locale-Maketext|devel/p5-Memoize|devel/p5-Module-Build|devel/p5-Params-Validate|devel/p5-Parse-RecDescent|devel/p5-Storable|devel/p5-Test-Harness|devel/p5-Test-Inline|devel/p5-Test-Simple|devel/p5-Tie-IxHash|devel/p5-Time-HiRes|devel/p5-Time-Local|devel/p5-Time-modules|devel/p5-TimeDate|dns/p5-Net-DNS|mail/p5-Mail-SpamAssassin|mail/p5-Mail-Tools|math/p5-Bit-Vector|misc/p5-I18N-LangTags|net/p5-URI|security/p5-Digest-HMAC|security/p5-Digest-MD5|security/p5-Digest-Nilsimsa|security/p5-Digest-SHA1|textproc/p5-FreeBSD-Ports|textproc/p5-Text-Balanced|textproc/p5-Text-Template|textproc/p5-YAML|www/p5-CGI-Application|www/p5-CGI-Kwiki|www/p5-CGI-modules|www/p5-CGI-Session|www/p5-CGI.pm|www/p5-HTML-Parser|www/p5-HTML-Tagset|www/p5-HTML-Template|www/p5-HTML-Tree|www/p5-libwww|www/p5-Template-Toolkit|x11-toolkits/p5-Tk|x11/p5-X11-Protocol)

Reduced string(751):
(devel/p5-(Memoize|ExtUtils-ParseXS|Parse-RecDescent|Test-(Inline|Simple|Harness)|DateConvert|Class-(Singleton|Factory-Util)|Date-(Manip|Pcalc|Leapyear|ICal|Calc|ISO)|Params-Validate|Locale-Maketext|Config-(Ini|General)|AppConfig|Time-(modules|HiRes|Local)|Storable|Module-Build|Inline(|-CPP)|DateTime(|-(Locale|TimeZone))|File-Temp|Tie-IxHash|TimeDate)|x11(/p5-X11-Protocol|-toolkits/p5-Tk)|net/p5-URI|mail/p5-Mail-(Tools|SpamAssassin)|math/p5-Bit-Vector|www/p5-(Template-Toolkit|libwww|HTML-(Tree|Template|Tagset|Parser)|CGI.pm|CGI-(Application|Session|Kwiki|modules))|security/p5-Digest-(Nilsimsa|MD5|HMAC|SHA1)|misc/p5-I18N-LangTags|dns/p5-Net-DNS|archivers/p5-(Compress-Zlib|Archive-Tar)|textproc/p5-(Text-(Template|Balanced)|FreeBSD-Ports|YAML))

-- 
Bob Bell <bbell@xxxxxxxxxxxxxxxxxxxxx>
Attachment:
reduce.pl
Description: Perl program
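
For readers without the attachment, here is a rough Python sketch of the general idea discussed in this thread: factor common prefixes out of a flat alternation by building a tree of tokens and emitting nested groups. It is not the attached reduce.pl, and it may differ from it in detail. The function names (`build_trie`, `emit`, `reduce_alternation`) and the choice to split on "/" and "-" are assumptions made for illustration. It also assumes the input is a single flat group with no nested parentheses, and it copies each alternative verbatim, so a character such as the "." in CGI.pm keeps whatever meaning it had in the original regex.

```python
import re


def build_trie(alternatives):
    """Build a nested-dict tree keyed by word and delimiter tokens."""
    trie = {}
    for alt in alternatives:
        node = trie
        # Tokens are either a single delimiter ("/" or "-") or a run of
        # other characters, e.g. "devel", "/", "p5", "-", "Inline".
        for token in re.findall(r"[/-]|[^/-]+", alt):
            node = node.setdefault(token, {})
        node[""] = {}  # marks that an alternative ends at this node
    return trie


def emit(node):
    """Turn a tree node back into a regex fragment."""
    branches = []
    for token, child in node.items():
        if token == "":
            branches.append("")  # empty branch: the string may stop here
        else:
            branches.append(token + emit(child))
    if len(branches) == 1:
        return branches[0]  # no choice to make, so no group is needed
    return "(" + "|".join(branches) + ")"


def reduce_alternation(pattern):
    """Factor shared prefixes out of a flat "(a|b|c)" alternation."""
    if pattern.startswith("(") and pattern.endswith(")"):
        pattern = pattern[1:-1]
    return emit(build_trie(pattern.split("|")))


if __name__ == "__main__":
    flat = "(devel/p5-Inline|devel/p5-Inline-CPP|devel/p5-Storable|net/p5-URI)"
    print(reduce_alternation(flat))
    # prints: (devel/p5-(Inline(|-CPP)|Storable)|net/p5-URI)
```

Splitting on whole words rather than single characters keeps each package name intact in the output, which is the readability trade-off described in the message above. A character-by-character tree would use the same two functions with a different tokenizer.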