
Re: When to break threads (was: alternative threading algorithms for sloppy mailing list)



On Sat, Sep 06, 2003 at 04:29:25PM -0400, Daniel E. Eisenbud 
<eisenbud@xxxxxxxxxxxxxx> wrote:
> Maybe a decent heuristic, which might be pretty easy to implement, would
> be to keep the thread together if one subject is a substring of the
> other, or if they have at least a certain number of characters in
> common?  Of course, varying levels of "Re:" can be automatically taken
> care of by real_subj as determined by reply_regexp, but a sensible
> heuristic should also deal with whitespace damage within the subject --
> it's fairly common.

Furthermore, if this heuristic had few enough false positives, it could
also be used to enhance the threading by subject (aka pseudo-threading).
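
To make the heuristic concrete, here is a rough sketch in Python (not mutt
code). Everything in it is an assumption for illustration: REPLY_PREFIX is
only a simplified stand-in for reply_regexp, real_subject() and same_thread()
are hypothetical names, "characters in common" is read as the length of the
longest common substring, and the threshold of 10 is arbitrary. Whitespace
damage is handled by dropping all whitespace before comparing.

```python
import re
from difflib import SequenceMatcher

# Simplified, hypothetical stand-in for mutt's reply_regexp.
REPLY_PREFIX = re.compile(r"^(\s*(re|aw)(\[\d+\])?:\s*)+", re.IGNORECASE)


def real_subject(subject):
    """Strip leading reply prefixes, drop all whitespace, and lowercase."""
    stripped = REPLY_PREFIX.sub("", subject)
    return re.sub(r"\s+", "", stripped).lower()


def same_thread(a, b, min_common=10):
    """Guess whether two subjects belong together.

    True if one normalized subject is a substring of the other, or if
    their longest common substring has at least min_common characters
    (min_common is an arbitrary, assumed threshold).
    """
    x, y = real_subject(a), real_subject(b)
    if not x or not y:
        return False  # an empty subject would match everything
    if x in y or y in x:
        return True
    match = SequenceMatcher(None, x, y, autojunk=False).find_longest_match(
        0, len(x), 0, len(y)
    )
    return match.size >= min_common
```

Note that the substring test alone would treat a very short subject such as
"test" as part of nearly any thread containing that word, so a minimum length
for that test might be needed to keep false positives down.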

-Daniel

-- 
Daniel E. Eisenbud
eisenbud@xxxxxxxxxxxxxx
Computational Biology Center
Memorial Sloan-Kettering Cancer Center