notspam: a notmuch interface to spamassassin
Jameson Graef Rollins
jrollins at finestructure.net
Tue Mar 5 22:43:12 PST 2013
Hey, folks. I put together a little python program as an interface
between notmuch and spamassassin (sa) that I thought others might be
interested in:
git://finestructure.net/notspam
Its only dependencies are a running local sa daemon and python-notmuch.
It's pretty straightforward: it's just a single python script that has
two main functions, 'learn' and 'tag'. 'Learn' takes a notmuch search
and pipes the resulting messages into sa (via sa-learn) to be classified
as ham or spam. 'Tag' takes a notmuch search and passes the resulting
messages through the sa classifier (via spamc) to be tagged as ham or
spam.
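To make the 'learn' flow concrete, here is a rough sketch of the idea. It is
not the actual notspam code: it shells out to the notmuch CLI instead of using
python-notmuch, hands sa-learn file paths rather than piping messages, and the
function names (list_message_files, learn_command, learn) are made up for
illustration.

```python
import subprocess


def list_message_files(search_terms, run=subprocess.run):
    """Return the paths of the message files matching a notmuch search."""
    result = run(
        ["notmuch", "search", "--output=files"] + list(search_terms),
        check=True, capture_output=True, text=True,
    )
    return [line for line in result.stdout.splitlines() if line]


def learn_command(kind, files):
    """Build the sa-learn command line for one class of mail."""
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    # sa-learn accepts --spam or --ham followed by message files.
    return ["sa-learn", "--" + kind] + list(files)


def learn(kind, search_terms, run=subprocess.run):
    """Feed every message matching the search to sa-learn as 'kind'."""
    files = list_message_files(search_terms, run=run)
    if files:
        run(learn_command(kind, files), check=True)
    return len(files)


if __name__ == "__main__":
    # Hypothetical equivalent of: notspam learn spam tag:spam
    print(learn("spam", ["tag:spam"]), "messages passed to sa-learn")
```

The 'run' parameter is only there so the sketch can be exercised without
notmuch or sa installed. A very large search could exceed the command-line
length limit, so a real version would want to batch the file list.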
Here's how I've been using it:
* Tag spam manually with the tag 'spam'. It's good to have done this
for a while to build up a good amount of manual classification.
* Once you've got some manual classification, teach sa:
notspam learn spam tag:spam
notspam learn ham not tag:spam
Everything after the meat ('spam'/'ham') is taken as the notmuch
search terms. Rerun this periodically to update, but you might want
to restrict the search a little so sa-learn doesn't waste time
reprocessing old messages whose classification hasn't changed.
* Call 'notspam tag' in your post-new hook (all my new messages are
tagged 'new' initially):
notspam tag --spam=spamd tag:new
I give the sa-classified mail a different tag so it's easy to
distinguish what was classified by me and what was classified by sa.
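For anyone curious what the 'tag' step amounts to, here is a similar rough
sketch. Again, this is not the notspam source: it uses the notmuch CLI rather
than python-notmuch, the function names are invented, and it relies on
'spamc -c' exiting with status 1 when the message is judged to be spam.

```python
import subprocess


def matching_message_ids(search_terms, run=subprocess.run):
    """Return 'id:...' terms for every message matching the search."""
    result = run(
        ["notmuch", "search", "--output=messages"] + list(search_terms),
        check=True, capture_output=True, text=True,
    )
    return [line for line in result.stdout.splitlines() if line]


def is_spam(message_id, run=subprocess.run):
    """Pipe the raw message through spamc and report its verdict."""
    raw = run(
        ["notmuch", "show", "--format=raw", message_id],
        check=True, capture_output=True,
    ).stdout
    # 'spamc -c' only checks the message; exit status 1 means spam.
    verdict = run(["spamc", "-c"], input=raw, capture_output=True)
    return verdict.returncode == 1


def tag_messages(search_terms, spam_tag="spamd", run=subprocess.run):
    """Add spam_tag to each matching message that sa classifies as spam."""
    tagged = []
    for message_id in matching_message_ids(search_terms, run=run):
        if is_spam(message_id, run=run):
            run(["notmuch", "tag", "+" + spam_tag, "--", message_id],
                check=True)
            tagged.append(message_id)
    return tagged


if __name__ == "__main__":
    # Hypothetical equivalent of: notspam tag --spam=spamd tag:new
    for message_id in tag_messages(["tag:new"]):
        print("tagged", message_id)
```

Keeping the sa tag ('spamd' here) separate from the manual 'spam' tag means
the learn searches above only ever see mail that was classified by hand.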
Pretty simple. See 'notspam help' for more info.
Right now it's geared specifically for sa, but it would be easy to
expand it to handle arbitrary learn/classify commands. If there's any
further interest in this, I would be happy to help push on it more.
jamie.
PS: if anyone has any suggestions for Bayesian classifiers better than
sa I'm all ears. I'm not so happy with sa at the moment. It misses a
lot more spam than I would like. Maybe I just haven't tweaked it out
yet, in which case if anyone has any suggestions on how to improve sa's
classification I'm also all ears.