

I finally remembered what I wanted to write about last night. It makes perfect sense that the destruction of the Valentine's Day Bash distracted me from it.

More spam than usual has been reaching my inbox lately, so I went back through the docs for CRM114. It turns out that pre-training your Bayesian filter on a big pile of saved mail isn't the best idea. The better method, with a reported 2.1x boost to accuracy, is to start with a completely empty filter and train on errors: each time the filter misclassifies a message, you train it on that one message, in the order the mistakes occur, and you leave it alone when it gets things right.
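For the curious, here is a rough sketch of the train-on-error loop in Python. It is not CRM114's classifier or its command-line interface; `TinyBayes` and `train_on_error` are names I made up for illustration, and the scoring is a plain word-count naive Bayes with Laplace smoothing, chosen only to make the loop runnable.

```python
import math
from collections import Counter


class TinyBayes:
    """Toy word-count naive Bayes filter; a stand-in, not CRM114's classifier."""

    def __init__(self):
        # The filter starts completely empty: no pre-training.
        self.counts = {"spam": Counter(), "ham": Counter()}

    def classify(self, text):
        words = text.lower().split()
        vocab = len(set(self.counts["spam"]) | set(self.counts["ham"])) + 1
        scores = {}
        for label, counter in self.counts.items():
            total = sum(counter.values())
            # Laplace smoothing so unseen words do not zero out the score.
            scores[label] = sum(
                math.log((counter[w] + 1) / (total + vocab)) for w in words
            )
        # An empty filter scores both classes equally; ties default to "ham".
        return "spam" if scores["spam"] > scores["ham"] else "ham"

    def train(self, text, label):
        self.counts[label].update(text.lower().split())


def train_on_error(filt, stream):
    """Classify each message in arrival order; train only on the mistakes."""
    errors = 0
    for text, truth in stream:
        if filt.classify(text) != truth:
            filt.train(text, truth)
            errors += 1
    return errors
```

Feeding it a list of `(message, "spam" or "ham")` pairs in arrival order returns the number of corrections it needed. Messages the filter already gets right never touch the counts, which is the whole point of the method.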

It took an incredible amount of willpower to annihilate my old filter files and start from scratch, and the accuracy hasn't quite caught up to where I'd like it to be, but the simple fact of the matter is that I'm seeing slightly better results now than I did a week ago. I should probably start keeping a record of this somewhere.
