
Re: Is predictable spam filtering a vulnerability?



Andrew Hunter wrote:

I think spam filters aren't the solution to the spam problem. If someone gets 200 spam emails a day, what use is a spam filter telling them the email was rejected? The user will end up not looking at the list of rejected emails because it's so big.

Filtering certain words is also bad, e.g. "penis", "viagra".
It can easily be avoided:
Email "Free penis enlargement pills" - Would be filtered
Email "Free pen is enlargement pills" - Wouldn't be filtered
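
A minimal sketch of the exact-word filter described above, to make the evasion concrete. The block list and the function name `is_blocked` are hypothetical and are not taken from any real filter.

```python
# Hypothetical block list, for illustration only.
BLOCKED_WORDS = {"penis", "viagra"}

def is_blocked(subject: str) -> bool:
    """Return True if any whitespace-separated word is on the block list."""
    words = subject.lower().split()
    return any(word in BLOCKED_WORDS for word in words)

print(is_blocked("Free penis enlargement pills"))   # True
print(is_blocked("Free pen is enlargement pills"))  # False
```

Splitting the word with a single space is enough to get past an exact match.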

Sounds like you don't have much experience with spam filters.

I use the Bayesian spam filter built into Mozilla. On average, I get:

   * 95 legitimate mails per day
   * 180 additional spam per day
   * about 10 spam per day end up in the legitimate box, and the
     remaining 170 are filtered into the Junk folder, for periodic
     inspection
   * about 1 legitimate mail per month is mis-classified and put in the
     Junk folder
         o zero of them are critical, as the spam filter automatically
           does not Junk anything from anyone in my address book
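
The figures above can be turned into rough rates. This is only back-of-the-envelope arithmetic on the numbers quoted in the list; the 30-day month is an assumption.

```python
# Daily and monthly figures as quoted in the list above.
legit_per_day = 95
spam_per_day = 180
spam_missed_per_day = 10      # spam that lands in the legitimate box
legit_junked_per_month = 1    # legitimate mail mis-filed as Junk
days_per_month = 30           # assumption, not stated in the post

spam_caught_per_day = spam_per_day - spam_missed_per_day
catch_rate = spam_caught_per_day / spam_per_day
false_positive_rate = legit_junked_per_month / (legit_per_day * days_per_month)

print(spam_caught_per_day)            # 170
print(f"{catch_rate:.1%}")            # 94.4%
print(f"{false_positive_rate:.3%}")   # 0.035%
```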

Note that spammers *do* use hacks like "Free pen is enlargement" (and a broad variety of other cute typos) and the Bayesian filter catches on to them very quickly.

So in order to be effective it has to look for variations on the words.
For example, for "penis" it could look for "P E N I S", "peni$", etc.

This is when the problems start. I get sent 200 spam emails, so the rejected emails log is huge. I can't be bothered to look through it because it'll take too long, but it has removed an important email.

Email "Dear Andiroo, I have found your pen, it was under my desk. Your PEN IS now in the top drawer of your desk".

OK, I lost my special pen and my friend has found it, but look: "PEN IS" is like "PENIS", so it's been taken by the spam filter.
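
A minimal sketch of that failure mode, assuming a keyword filter that catches variations by stripping everything except letters before matching. The names `SPAM_WORDS`, `normalize` and `fuzzy_blocked` are hypothetical.

```python
import re

# Hypothetical block list, for illustration only.
SPAM_WORDS = {"penis", "viagra"}

def normalize(text: str) -> str:
    """Lowercase, map '$' to 's', then drop every non-letter character."""
    text = text.lower().replace("$", "s")
    return re.sub(r"[^a-z]", "", text)

def fuzzy_blocked(text: str) -> bool:
    """Return True if a blocked word appears anywhere in the normalized text."""
    squashed = normalize(text)
    return any(word in squashed for word in SPAM_WORDS)

print(fuzzy_blocked("P E N I S pills"))   # True: spaced-out variant is caught
print(fuzzy_blocked("peni$ pills"))       # True: '$' variant is caught
print(fuzzy_blocked("Your PEN IS now in the top drawer of your desk"))  # True: false positive
```

The same normalization that catches the variants also removes the space in "PEN IS", so the legitimate message is rejected.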

The Bayesian filter is not fooled by these issues either way. I can say "penis" in an e-mail and it will not get filtered, because the scoring system balances the total score.
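
To illustrate what "balances the total score" means, here is a textbook naive Bayes sketch with add-one smoothing and equal priors. It is not Mozilla's actual implementation, and the tiny training corpus and the function names are made up for the example.

```python
import math
from collections import Counter

def train(messages):
    """Count how often each lowercase word appears across the messages."""
    counts = Counter()
    for text in messages:
        counts.update(text.lower().split())
    return counts

def spam_log_odds(text, spam_counts, ham_counts):
    """Sum per-word log likelihood ratios; above 0 leans spam, below 0 leans legitimate."""
    vocab = set(spam_counts) | set(ham_counts)
    spam_total = sum(spam_counts.values()) + len(vocab)
    ham_total = sum(ham_counts.values()) + len(vocab)
    score = 0.0
    for word in text.lower().split():
        p_spam = (spam_counts[word] + 1) / spam_total  # add-one smoothing
        p_ham = (ham_counts[word] + 1) / ham_total
        score += math.log(p_spam / p_ham)
    return score

# Made-up training data, for illustration only.
spam_counts = train([
    "free penis enlargement pills",
    "free pen is enlargement pills",
    "cheap pills free offer",
])
ham_counts = train([
    "i found your pen under my desk",
    "your pen is in the top drawer of your desk",
    "the penis is discussed in the anatomy lecture notes",
])

evasive_spam = "Free pen is enlargement pills"
pen_mail = "Your pen is now in the top drawer of your desk"
anatomy_mail = "the anatomy lecture notes discuss the penis"

print(spam_log_odds(evasive_spam, spam_counts, ham_counts) > 0)   # True: still scored as spam
print(spam_log_odds(pen_mail, spam_counts, ham_counts) < 0)       # True: scored as legitimate
print(spam_log_odds(anatomy_mail, spam_counts, ham_counts) < 0)   # True: one spammy word is outweighed
```

No single word decides the outcome; every word in the message contributes to the score, so "pen is" neither rescues the spam nor condemns the legitimate mail.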

My solution for spam:
I think there should be a huge database of spam emails, just like an anti-virus scanner but for spam. I think it is that simple: have an anti-virus, but for spam. I am sure that if I get a spam email, someone else will have received exactly the same email, so if I can submit it to the database and it is added quickly so everyone can get the updates, then there would be no problem. But there is so much spam out there that we would forever have to update ever-growing databases.

That was tried a long time ago, and it failed. The problem is that the spammers caught on to it quickly, and started adding random junk to their mails just so that no two of them would have the same checksum. That is why you see random junk characters in the headers and bodies of spams.
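
A minimal sketch of why exact checksum matching breaks, assuming the database stores a SHA-256 digest of each reported message body. The digest choice, the sample message and the helper names are assumptions for illustration; the post does not say which checksum the real systems used.

```python
import hashlib
import random
import string

def checksum(body: str) -> str:
    """Return the SHA-256 hex digest of the message body."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()

# Hypothetical shared database holding one reported spam message.
reported = "Free enlargement pills, click here"
known_spam = {checksum(reported)}

def is_known_spam(body: str) -> bool:
    """Return True only if this exact body has been reported before."""
    return checksum(body) in known_spam

def add_random_junk(body: str, rng: random.Random) -> str:
    """Append eight random lowercase letters, as a spammer might."""
    junk = "".join(rng.choice(string.ascii_lowercase) for _ in range(8))
    return f"{body} {junk}"

rng = random.Random(0)
print(is_known_spam(reported))                        # True
print(is_known_spam(add_random_junk(reported, rng)))  # False
```

Any change to the body, however small, gives a different digest, so each randomized copy looks like a brand-new message to the database.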

I think this would eliminate a lot of spam. I have run out of ideas for preventing spam emails, so what other effective solutions are already out there?

I think if the penalty for spamming was having your head mounted on a stick then there would be a lot less spam :)

Crispin

--
Crispin Cowan, Ph.D.  http://immunix.com/~crispin/
CTO, Immunix          http://immunix.com