I can't really find answers to:
- Which messages are considered when comparing for Junk? All the ones tagged as Junk?
- What happens when I delete messages that are tagged as Junk? Does Mozilla 'forget' those examples?
- Is it better to keep just a certain amount of example Junk around? Or should I save all Spam/Junk?
Background on this: I now have an example corpus of 1300+ Spam/Junk messages, and I have noticed a drop in detection accuracy. I originally trained Mozilla on a large corpus of Junk mail and found it was overzealous. Now that I have marked a lot of messages as 'not junk' I see the exact opposite: it doesn't detect some obvious Junk at all...
So is it better to have just a rather small (200+) corpus of example Junk, or what?
And then, is there a difference between messages not marked at all and messages marked 'not junk'?
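To make these questions concrete, here is a minimal sketch of how a naive Bayesian filter could keep its training data. It is not Mozilla's actual implementation, which I have not looked at; the class `TinyBayes` and its methods `train`, `forget` and `junk_probability` are hypothetical names made up for illustration. The assumption is that the filter stores only per-token counts for each class, so 'forgetting' a message would mean subtracting its tokens again, and a message that was never marked would contribute nothing at all.

```python
import math
from collections import Counter


class TinyBayes:
    """Toy naive Bayes filter (hypothetical, not Mozilla's code)."""

    def __init__(self):
        # Per class: in how many training messages each token appeared.
        self.tokens = {"junk": Counter(), "good": Counter()}
        # Per class: how many messages were trained.
        self.messages = {"junk": 0, "good": 0}

    def train(self, text, label):
        # Count each distinct token once per message.
        self.tokens[label].update(set(text.lower().split()))
        self.messages[label] += 1

    def forget(self, text, label):
        # Undo an earlier train() call for the same text and label.
        self.tokens[label].subtract(set(text.lower().split()))
        self.messages[label] -= 1

    def junk_probability(self, text):
        # Start from the class prior, then add one log ratio per token.
        # The +1 / +2 terms are Laplace smoothing to avoid zero counts.
        log_odds = math.log(
            (self.messages["junk"] + 1) / (self.messages["good"] + 1)
        )
        for tok in set(text.lower().split()):
            p_junk = (self.tokens["junk"][tok] + 1) / (self.messages["junk"] + 2)
            p_good = (self.tokens["good"][tok] + 1) / (self.messages["good"] + 2)
            log_odds += math.log(p_junk / p_good)
        return 1 / (1 + math.exp(-log_odds))


f = TinyBayes()
f.train("cheap pills now", "junk")
f.train("cheap watches now", "junk")
f.train("meeting notes attached", "good")
f.train("lunch tomorrow", "good")
before = f.junk_probability("cheap pills")

# Marking several look-alike messages as 'not junk' shifts the balance.
for _ in range(3):
    f.train("cheap pills refill reminder", "good")
after = f.junk_probability("cheap pills")
print(before, after)
```

In this sketch the score depends on the ratio between the two sets of counts, not on the absolute size of the Junk corpus, which would fit the swing from overzealous to too lenient described above. Whether Mozilla behaves the same way is exactly what the questions are asking.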
[by Martin]
similar entries (vs):
- ok, it's proven (# 21%)
- Mozilla Junk (# 20%)
- If you think leaving rude messages will get my attention (# 10%)
- Dave Farquhar on the new naive bayesian spam filter in Mozilla (# 9%)
Martin Spernau
© 1994-2003
Big things to come (TM) 30th Dez 2002