Re: sort-mailbox by spam tag score sorting strangeness
The attached patch resolves the problem presented below in mutt-users.
I recommend it for inclusion in CVS.
The problem occurs because the code optimistically parses scores with
strtoul(), which reads only the integer part: scores such as 0.5 and
-0.6 both parse as 0, so they fall into lexical sort instead of
numeric. (When I developed this, I worked primarily with
positive-integer spam scores such as "79%".)
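
For anyone who wants to see the parsing difference outside mutt, here
is a minimal sketch. The helper names parse_old(), parse_new(), and
show_parse() are hypothetical and exist only for this illustration;
they are not part of sort.c. It assumes the default "C" locale, where
the decimal separator is a period.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: parse the way the old code did. Returns the
 * integer strtoul() extracts and points *rest at the unparsed tail. */
static unsigned long parse_old(const char *score, const char **rest)
{
  char *end;
  unsigned long value = strtoul(score, &end, 10); /* stops at the '.' */

  *rest = end;
  return value;
}

/* Hypothetical helper: parse the way the patch does. strtod() consumes
 * the sign and the fraction, so the numeric value is complete. */
static double parse_new(const char *score, const char **rest)
{
  char *end;
  double value = strtod(score, &end);

  *rest = end;
  return value;
}

/* Print both interpretations of one spam score side by side. */
static void show_parse(const char *score)
{
  const char *old_rest, *new_rest;
  unsigned long old_value = parse_old(score, &old_rest);
  double new_value = parse_new(score, &new_rest);

  printf("%-5s strtoul=%lu rest=\"%s\"  strtod=%g rest=\"%s\"\n",
         score, old_value, old_rest, new_value, new_rest);
}
```

Calling show_parse() on "-0.2" and "0.5" should report the same
strtoul() value of 0 for both, with ".2" and ".5" left over. With the
numeric parts equal, only those leftovers are available to decide the
order, which would explain why the listing below appears sorted by
absolute value between -1 and 1.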
> I have the following set:
>
> spam "X-Spam-Status: (Yes|No), score=(-?[[:digit:]]+\.[[:digit:]]+)" "%2"
> set sort=spam
> set index_format="%4C %Z %{%b%d} %4H %-15.15F %4c %s"
>
> This is to parse spamassassin headers such as this:
>
> X-Spam-Status: No, score=-0.2 required=5.0 tests=BAYES_40,UNPARSEABLE_RELAY
> autolearn=ham version=3.1.3
>
> This seems to get confused about the sign of the spam score.
> Sometimes it seems to get it right. And sometimes it does not.
> Here is an example, with the names and subjects blacked out to protect
> the innocent.
>
> 69 Nov13 0.1 XXXXXXXX 2.6K XXXXX
> 70 Nov10 0.2 XXXXXXXX 2.5K XXXXX
> 71 Nov10 -0.2 XXXXXXXX 2.5K XXXXX
> 72 Nov10 -0.2 XXXXXXXX 0.4K XXXXX
> 73 Nov11 -0.2 XXXXXXXX 0.7K XXXXX
> 74 Nov13 -0.2 XXXXXXXX 7.0K XXXXX
> 75 Nov13 -0.2 XXXXXXXX 6.7K XXXXX
> 76 Nov10 0.5 XXXXXXXX 3.7K XXXXX
> 77 Nov10 0.5 XXXXXXXX 4.3K XXXXX
> 78 Nov14 0.5 XXXXXXXX 3.4K XXXXX
> 79 Nov11 -0.6 XXXXXXXX 1.1K XXXXX
> 80 Nov11 -0.6 XXXXXXXX 1.0K XXXXX
> 81 Nov10 -0.7 XXXXXXXX 0.8K XXXXX
> 82 Nov10 -0.7 XXXXXXXX 1.3K XXXXX
> 83 Nov13 -0.7 XXXXXXXX 2.1K XXXXX
> 84 Nov14 -0.7 XXXXXXXX 0.9K XXXXX
> 85 Nov09 -0.8 XXXXXXXX 1.0K XXXXX
> 86 Nov11 0.8 XXXXXXXX 0.3K XXXXX
> 87 Nov11 0.9 XXXXXXXX 0.9K XXXXX
> 88 Nov11 1.0 XXXXXXXX 2.9K XXXXX
> 89 Nov09 1.1 XXXXXXXX 90K XXXXX
>
> Negative values less than -1 all seem to sort okay. It seems to be
> only the absolute values <1 that cause issues. Why is it sorting this
> so strangely? Why does it appear to be taking the absolute value of
> the score but only if the score is between -1..1 but is okay outside
> that range? This is mutt 1.5.9-2sarge2 from Debian Sarge stable.
--
-D. dgc@xxxxxxxxxxxx NSIT University of Chicago
CVSROOT =
Using: /opt/bin/cvs diff sort.c
Index: sort.c
===================================================================
RCS file: /home/roessler/cvs/mutt/sort.c,v
retrieving revision 3.9
diff -u -r3.9 sort.c
--- sort.c 17 Sep 2005 20:46:11 -0000 3.9
+++ sort.c 22 Nov 2006 06:57:06 -0000
@@ -160,6 +160,7 @@
char *aptr, *bptr;
int ahas, bhas;
int result = 0;
+ double difference;
/* Firstly, require spam attributes for both msgs */
/* to compare. Determine which msgs have one. */
@@ -183,8 +184,11 @@
/* Both have spam attrs. */
/* preliminary numeric examination */
- result = (strtoul((*ppa)->env->spam->data, &aptr, 10) -
- strtoul((*ppb)->env->spam->data, &bptr, 10));
+ difference = (strtod((*ppa)->env->spam->data, &aptr) -
+ strtod((*ppb)->env->spam->data, &bptr));
+
+ /* map double into comparison (-1, 0, or 1) */
+ result = (difference < 0.0 ? -1 : difference > 0.0 ? 1 : 0);
/* If either aptr or bptr is equal to data, there is no numeric */
/* value for that spam attribute. In this case, compare lexically. */
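
To try the patched comparison without building mutt, the following is
a rough standalone sketch. It is not the code in sort.c: the function
names compare_scores() and compare_for_qsort() are hypothetical, it
takes plain strings rather than mutt's header structures, and the two
lexical fallbacks are my simplification modeled on the comments in the
patch context.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the patched comparison, operating on the
 * raw spam attribute strings instead of mutt's header pointers. */
static int compare_scores(const char *a, const char *b)
{
  char *aptr, *bptr;
  double difference = strtod(a, &aptr) - strtod(b, &bptr);

  /* If either string has no numeric prefix, compare lexically.
   * (Simplified assumption: compares the whole strings.) */
  if (aptr == a || bptr == b)
    return strcmp(a, b);

  /* map double into comparison (-1, 0, or 1) */
  if (difference < 0.0)
    return -1;
  if (difference > 0.0)
    return 1;

  /* Equal numeric values: let any trailing text decide. */
  return strcmp(aptr, bptr);
}

/* Adapter so qsort() can sort an array of const char * scores. */
static int compare_for_qsort(const void *pa, const void *pb)
{
  return compare_scores(*(const char *const *) pa,
                        *(const char *const *) pb);
}
```

Passing compare_for_qsort() to qsort() on an array such as
{ "0.5", "-0.7", "0.1", "-0.2" } should place the negative scores
first. The explicit mapping step matters because assigning the double
difference directly to the int result would truncate a value like -0.3
to 0 and make the two scores look equal.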