
Re: sort-mailbox by spam tag score sorting strangeness



The attached patch resolves the problem reported below on mutt-users.
I recommend it for inclusion in CVS.

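The problem occurs because the optimistic use of strtoul() forces
negative numbers and floats into lexical sort instead of numeric sort.
strtoul() stops parsing at the decimal point, so "-0.2" and "0.5" both
parse as 0 and the numeric comparison cannot tell them apart.  (When I
developed this, I worked primarily with positive-integer spam scores
such as "79%".)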
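
To illustrate the difference outside of mutt, here is a minimal
standalone sketch.  It is not part of the patch: the two helper names
are hypothetical, and the helpers only mirror the "preliminary numeric
examination" step, not the lexical or auxiliary fallbacks in sort.c.
It assumes the default "C" locale, where "." is the decimal point.

```c
#include <stdlib.h>

/* Hypothetical helper: old behaviour, subtracting strtoul() results.
 * strtoul() reads only the integer part, so "-0.2" and "0.5" both
 * yield 0 and the difference is 0. */
static int cmp_scores_strtoul(const char *a, const char *b)
{
  return (int) (strtoul(a, NULL, 10) - strtoul(b, NULL, 10));
}

/* Hypothetical helper: patched behaviour, subtracting strtod()
 * results and mapping the double onto -1, 0, or 1. */
static int cmp_scores_strtod(const char *a, const char *b)
{
  double difference = strtod(a, NULL) - strtod(b, NULL);

  return (difference < 0.0 ? -1 : difference > 0.0 ? 1 : 0);
}
```

With these, cmp_scores_strtoul("-0.2", "0.5") sees the two scores as
numerically equal, while cmp_scores_strtod("-0.2", "0.5") orders the
negative score first.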

> I have the following set:
> 
>   spam "X-Spam-Status: (Yes|No), score=(-?[[:digit:]]+\.[[:digit:]]+)" "%2"
>   set sort=spam
>   set index_format="%4C %Z %{%b%d} %4H %-15.15F %4c %s"
> 
> This is to parse spamassassin headers such as this:
> 
>   X-Spam-Status: No, score=-0.2 required=5.0 tests=BAYES_40,UNPARSEABLE_RELAY
>           autolearn=ham version=3.1.3
> 
> This seems to get confused about the sign of the spam score.
> Sometimes it seems to get it right.  And sometimes it does not.
> Here is an example, with the names and subjects blacked out to protect
> the innocent.
> 
>   69     Nov13  0.1 XXXXXXXX    2.6K XXXXX
>   70     Nov10  0.2 XXXXXXXX    2.5K XXXXX
>   71     Nov10 -0.2 XXXXXXXX    2.5K XXXXX
>   72     Nov10 -0.2 XXXXXXXX    0.4K XXXXX
>   73     Nov11 -0.2 XXXXXXXX    0.7K XXXXX
>   74     Nov13 -0.2 XXXXXXXX    7.0K XXXXX
>   75     Nov13 -0.2 XXXXXXXX    6.7K XXXXX
>   76     Nov10  0.5 XXXXXXXX    3.7K XXXXX
>   77     Nov10  0.5 XXXXXXXX    4.3K XXXXX
>   78     Nov14  0.5 XXXXXXXX    3.4K XXXXX
>   79     Nov11 -0.6 XXXXXXXX    1.1K XXXXX
>   80     Nov11 -0.6 XXXXXXXX    1.0K XXXXX
>   81     Nov10 -0.7 XXXXXXXX    0.8K XXXXX
>   82     Nov10 -0.7 XXXXXXXX    1.3K XXXXX
>   83     Nov13 -0.7 XXXXXXXX    2.1K XXXXX
>   84     Nov14 -0.7 XXXXXXXX    0.9K XXXXX
>   85     Nov09 -0.8 XXXXXXXX    1.0K XXXXX
>   86     Nov11  0.8 XXXXXXXX    0.3K XXXXX
>   87     Nov11  0.9 XXXXXXXX    0.9K XXXXX
>   88     Nov11  1.0 XXXXXXXX    2.9K XXXXX
>   89     Nov09  1.1 XXXXXXXX     90K XXXXX
> 
> Negative values less than -1 all seem to sort okay.  It seems to be
> only the absolute values <1 that cause issues.  Why is it sorting this
> so strangely?  Why does it appear to be taking the absolute value of
> the score but only if the score is between -1..1 but is okay outside
> that range?  This is mutt 1.5.9-2sarge2 from Debian Sarge stable.

-- 
 -D.    dgc@xxxxxxxxxxxx        NSIT    University of Chicago
CVSROOT = 
Using: /opt/bin/cvs diff sort.c
Index: sort.c
===================================================================
RCS file: /home/roessler/cvs/mutt/sort.c,v
retrieving revision 3.9
diff -u -r3.9 sort.c
--- sort.c      17 Sep 2005 20:46:11 -0000      3.9
+++ sort.c      22 Nov 2006 06:57:06 -0000
@@ -160,6 +160,7 @@
   char   *aptr, *bptr;
   int     ahas, bhas;
   int     result = 0;
+  double  difference;
 
   /* Firstly, require spam attributes for both msgs */
   /* to compare. Determine which msgs have one.     */
@@ -183,8 +184,11 @@
   /* Both have spam attrs. */
 
   /* preliminary numeric examination */
-  result = (strtoul((*ppa)->env->spam->data, &aptr, 10) -
-            strtoul((*ppb)->env->spam->data, &bptr, 10));
+  difference = (strtod((*ppa)->env->spam->data, &aptr) -
+                strtod((*ppb)->env->spam->data, &bptr));
+
+  /* map double into comparison (-1, 0, or 1) */
+  result = (difference < 0.0 ? -1 : difference > 0.0 ? 1 : 0);
 
   /* If either aptr or bptr is equal to data, there is no numeric    */
   /* value for that spam attribute. In this case, compare lexically. */