Add layered spam defense with Bayesian filter and first-post confirmation #963
sdeibel wants to merge 1 commit into
Adds a multi-layered spam defense system:

- Dual Bayesian classifier (spam + ham models) with lazy loading, thread-safe operation, and fail-open design
- First-post email confirmation for watched users: post is held until the user clicks a confirmation link
- Optional moderator queue after email confirmation
- Silent deletion mode for obvious spam (spam-only, no ham match)
- Incremental learning: spam model updated when posts are marked as spam, ham model updated when posts are approved by moderators
- Management commands for training and cleanup

All features are opt-in via livesettings with safe defaults (disabled). Replaces the existing spam checker calls in ask/answer/comment views with the dual Bayesian check while preserving the original spam checker as a fallback.
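To make the "spam-only, no ham match" rule and the fail-open behaviour concrete, here is a minimal sketch of how two independent scores might be combined into one verdict. The names `Verdict`, `classify`, `spam_score`, `ham_score`, and the 0.9 threshold are assumptions for illustration and are not taken from the PR's code.

```python
from enum import Enum
from typing import Callable


class Verdict(Enum):
    HAM = "ham"
    SPAM = "spam"
    UNSURE = "unsure"


def classify(
    text: str,
    spam_score: Callable[[str], float],
    ham_score: Callable[[str], float],
    threshold: float = 0.9,  # hypothetical cut-off, not from the PR
) -> Verdict:
    """Combine a spam model and a ham model into a single verdict."""
    try:
        spam_hit = spam_score(text) >= threshold
        ham_hit = ham_score(text) >= threshold
    except Exception:
        # Fail open (assumed meaning): a broken model never blocks a post.
        return Verdict.HAM
    if spam_hit and not ham_hit:
        return Verdict.SPAM  # spam only, no ham match
    if ham_hit and not spam_hit:
        return Verdict.HAM
    # Both models matched, or neither did: leave the decision to a human.
    return Verdict.UNSURE
```

Only the `SPAM` verdict would be a candidate for silent deletion in this sketch; `UNSURE` would fall through to confirmation or the moderator queue.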
evgenyfadeev
left a comment
Needs some discussion before acting on this PR.
settings.register(
    livesettings.BooleanValue(
        SPAM_DEFENSE,
        'FIRST_POST_MODERATE_AFTER_CONFIRMATION',
There is a measure that all posts of "watched" users are pre-moderated, if the moderation mode is "premoderation"; it's not clear how this would be compatible with the proposed "FIRST_POST_MODERATE_AFTER_CONFIRMATION" - what if the moderation mode is "premoderation" and this setting is "False"?
# Spam only, no ham match
if is_first_post and user.is_watched():
    if askbot_settings.BAYESIAN_SPAM_SILENT_DELETE:
        user.delete()
This might be too eager in allowing the machine to delete user accounts, unless the spam classification is ultra-reliable. Also, as I noted in PR 964, perhaps it would be better to delete accounts automatically after some time rather than instantly?
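One way to picture the delayed-deletion idea is a grace period between flagging and deletion, during which a moderator can clear a false positive. This is only a sketch: `FlaggedAccount`, `due_for_deletion`, and the seven-day default are hypothetical and do not appear in the PR.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, List


@dataclass
class FlaggedAccount:
    user_id: int
    flagged_at: datetime
    cleared: bool = False  # set by a moderator if the flag was a mistake


def due_for_deletion(
    accounts: Iterable[FlaggedAccount],
    now: datetime,
    grace: timedelta = timedelta(days=7),  # hypothetical grace period
) -> List[int]:
    """Return ids of accounts flagged at least `grace` ago and never cleared."""
    return [
        account.user_id
        for account in accounts
        if not account.cleared and now - account.flagged_at >= grace
    ]
```

A periodic job (for example a cleanup management command) could call such a function and delete only the returned accounts, so the classifier itself never deletes anything directly.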
confirmation = PostConfirmation(post=post, user=user)
confirmation.save()

post.approved = False
Not a huge issue - a minor nitpick. I mostly commented this for myself - I can later resolve these.
I think this method should not modify the post - the function is called "_send_first_post_confirmation" and modification of post attributes would be an unexpected side-effect.
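The separation suggested here could look roughly like the following, with state changes and notification in distinct functions and one coordinating caller. The `Post` and `Outbox` stand-ins and all function names are hypothetical placeholders, not askbot models or APIs.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Post:
    approved: bool = True


@dataclass
class Outbox:
    """Stand-in for a mail backend; records messages instead of sending."""
    sent: List[Tuple[str, str]] = field(default_factory=list)

    def send(self, to: str, body: str) -> None:
        self.sent.append((to, body))


def hold_for_confirmation(post: Post) -> None:
    """State change only: mark the post as not yet approved."""
    post.approved = False


def send_first_post_confirmation(email: str, link: str, outbox: Outbox) -> None:
    """Notification only: does not touch the post."""
    outbox.send(email, f"Please confirm your first post: {link}")


def handle_first_post(post: Post, email: str, link: str, outbox: Outbox) -> None:
    """Caller that makes both effects explicit at one call site."""
    hold_for_confirmation(post)
    send_first_post_confirmation(email, link, outbox)
```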
# Also mark thread unapproved for questions
if post.post_type == 'question':
    post.thread.approved = False
    post.thread.save(update_fields=['approved'])

revision = post.get_latest_revision()
if revision:
    revision.approved = False
    revision.save(update_fields=['approved'])
    recipient_list=[user.email],
)

request.user.message_set.create(
This is also a non-email-related side effect (but if the function were renamed "notify ...", it would be OK).
Q: Is the idea that having users confirm their first post adds friction and so reduces the amount of spam? Email address confirmation is already implemented, so this measure seems aimed at making it harder to make or automate the first post. It is easily bypassed if spammers can already automate email confirmations.

Q: How does this Bayesian filter compare to what there is on askbot hosting? You were using it for a while, so I'd guess you'd know. I'm curious how the efficacy of a simpler classifier compares with a transformer-based model. I've used a small model: a pre-trained BERT transformer with a grafted classifier head, which I later fine-tuned on several thousand samples of spam and ham. The fine-tuning took 30 minutes on CPU, and the accuracy on my test set was around 99%. I don't have problems sharing it along with the weights.

Issue: AFAIK this change does not allow using alternative spam checkers, unlike the pre-existing implementation. (A ham checker does not exist in the master branch, so it would be a new feature.) The spam checker must be replaceable by configuration.

Issue: Spam and ham classifications serve distinct purposes, and the actions taken on spam and ham classifier results are different (spam: delete and save in spam samples; ham: accept or place on the moderation queue). I think it would be good to decouple these two concerns.
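As an illustration of "replaceable by configuration", a checker can be named in settings by its dotted path and resolved at runtime, with the spam and ham checkers configured independently. The helper `load_callable` and the setting names in the comments are assumptions for this sketch; Django's `django.utils.module_loading.import_string` provides equivalent behaviour.

```python
from importlib import import_module
from typing import Any, Callable


def load_callable(dotted_path: str) -> Callable[..., Any]:
    """Resolve 'package.module.name' to the object it names."""
    module_path, _, attr = dotted_path.rpartition(".")
    if not module_path:
        raise ValueError(f"{dotted_path!r} is not a dotted path")
    return getattr(import_module(module_path), attr)


# Hypothetical configuration, one entry per concern so that each
# checker can be swapped without touching the other:
#   SPAM_CHECKER = "myproject.spam.bayes_is_spam"
#   HAM_CHECKER = "myproject.spam.bayes_is_ham"
# The views would then call load_callable(SPAM_CHECKER)(text) rather
# than importing a specific classifier directly.
```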
Q1: Yes, because bots were getting through signup and submitting spam which we then had to moderate. I was trying to reduce the moderator burden. Bots don't seem to be written to do the second confirmation. Yes, they probably could be later, particularly when powered by AI, but my focus was just to get our site working and manageable, so I didn't go further than that.

Q2: Sorry, I don't feel like I have enough data to compare the two Bayesian filters. So far it seems to be working, but our site is not that high traffic when it comes to real people posting real content, and the rest is so far going away on its own. I should probably add more monitoring, although again I was trying to make it low maintenance and figured people with problems would contact us by email.

Issue 1: Good point. I was focusing on getting our site working. If there's a good way to support other spam solutions, that would be fine. I thought that was already there, but maybe I ended up implementing this too independently.

Issue 2: Hmm, I'm seeing these as one system. It's based on how I've been filtering email for 30 years, and it has always worked: feed both filters and have them work together to decide what is spam and what is not. Without the ham filter it really doesn't work well, and I suspect it also wouldn't without the spam filter, of course.

Sorry for my slow replies; I'm very busy at the moment.