Update: We've deployed machine learning auto-flagging
Given the positive reception, we've activated machine learning anti-spam's auto-flagging feature, as detailed below. Staff will be monitoring its work closely over the next few weeks to ensure accuracy, especially with regard to automatic binding flags that result in instant deletion.
We're excited to see how big an impact this feature makes on the site's spam influx, and we'll be back to share data on the ML model's effectiveness in a couple of months.
On behalf of the moderation tooling team, thank you all for your feedback and analysis; your voice is an essential part of the work that we do!
Original: Should Stack Overflow use machine learning to flag spam automatically?
In the spirit of reducing moderator workload, we’ve started using a machine learning model to automatically identify, flag, and delete spam. So far it’s been extremely effective on Super User, having flagged 80% of all spam on that site since it was activated on December 10th, 2025. After improving the model and running it in evaluation mode over our year-end break, we’d like you to review the data on its accuracy so that we can earn your trust to activate its flagging capability on Stack Overflow as well.
How do the anti-spam capabilities work?
When a post is created, or a post author edits their post, our systems run it through a spam evaluation consisting of two checks: a similarity evaluation and a pass through our ML model. The ML model is trained on several years’ worth of deleted spam from across the network and yields a confidence score between 0% and 100%. At a high level of “spam confidence”, we raise a non-binding spam flag, which counts towards the four spam flags required for automatic deletion as spam. At a very high “spam confidence” level, we automatically delete the post with a binding spam flag.
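For anyone curious about the decision flow, here’s a rough sketch of the thresholding logic described above, in Python. The `ml_spam_confidence` helper and both threshold values are purely illustrative; they don’t reflect the actual model, cutoffs, or internal plumbing.

```python
# Illustrative sketch only: the threshold values and ml_spam_confidence()
# are placeholders, not the real model or cutoffs.

NON_BINDING_THRESHOLD = 0.85  # "high" spam confidence (illustrative value)
BINDING_THRESHOLD = 0.98      # "very high" spam confidence (illustrative value)


def ml_spam_confidence(post_body: str) -> float:
    """Stand-in for the trained model's spam-confidence score (0.0 to 1.0)."""
    raise NotImplementedError("placeholder for the internal ML model")


def evaluate_post(post_body: str) -> str:
    """Decide what the ML check does when a post is created or edited.

    The similarity evaluation runs alongside this and isn't shown here.
    """
    confidence = ml_spam_confidence(post_body)

    if confidence >= BINDING_THRESHOLD:
        # Binding flag: the post is deleted immediately.
        return "binding_spam_flag"
    if confidence >= NON_BINDING_THRESHOLD:
        # Non-binding flag: counts as one of the four spam flags required
        # for automatic deletion and can be dismissed by moderators.
        return "non_binding_spam_flag"
    return "no_flag"
```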
Automatic non-binding spam flags do not affect the post’s score and are largely invisible to non-moderators; like any other flag, they can be dismissed by moderators if they’re found to be unhelpful. Automatic binding spam flags attach a unique post notice, which links to an explanatory help center article. Our hope is that this help center article gives legitimate posts a visible and easy path to undeletion with the help of a handling moderator.
For more detailed information about how ML anti-spam operates, please review the network-wide announcement.
How effective would ML anti-spam be on Stack Overflow?
We ran our latest ML model in silent evaluation mode on Stack Overflow through the holidays, and the results are quite impressive. Between December 19th, 2025 and January 9th, 2026, there were 731 instances of spam on Stack Overflow. ML anti-spam would’ve identified and flagged 468 of them, with 24 false positive non-binding flags and 7 false positive binding flags. That works out to a 94% accuracy rating (correct flags as a share of all flags raised). If flagging had been enabled, we would have flagged 64% of all spam posted in that window, with 80% of the caught spam instantly deleted by a binding flag.
Here’s the ML model’s theoretical flagging summary data in a table:
| Total spam | Autoflagged (%) | Non-binding true-positive flags | Binding true-positive flags |
|---|---|---|---|
| 731 | 468 (64%) | 94 | 374 |
It is worth noting that this system will work alongside Similarity anti-spam, which is already flagging on Stack Overflow, so it makes sense to review that detector’s effectiveness over the same period. Since December 10th, 2025, Similarity anti-spam has caught 100 spam posts on Stack Overflow (13.7% of the total), with 14 false positive non-binding flags and no false positive binding flags. That works out to an 87% accuracy rating. While there is some detection overlap between the two, it’s clear that ML anti-spam detects significantly more spam and is more accurate when doing so.
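If you’d like to double-check the percentages above, here’s a short snippet that re-derives them from the raw counts in this post. “Accuracy” is computed here as true positives divided by all flags raised, which appears to be the definition behind the quoted figures.

```python
# Re-deriving the quoted percentages from the counts in this post.
# "Accuracy" is taken as true positives / all flags raised.

total_spam = 731

# ML anti-spam, silent evaluation (Dec 19, 2025 - Jan 9, 2026)
ml_tp_non_binding = 94
ml_tp_binding = 374
ml_tp = ml_tp_non_binding + ml_tp_binding  # 468 true-positive flags
ml_fp = 24 + 7                             # false-positive flags

ml_coverage = ml_tp / total_spam           # ~0.640 -> "64% of all spam"
ml_accuracy = ml_tp / (ml_tp + ml_fp)      # ~0.938 -> "94% accuracy"
ml_binding_share = ml_tp_binding / ml_tp   # ~0.799 -> "80% instantly deleted"

# Similarity anti-spam (since Dec 10, 2025)
sim_tp = 100
sim_fp = 14

sim_coverage = sim_tp / total_spam         # ~0.137 -> "13.7% of spam"
sim_accuracy = sim_tp / (sim_tp + sim_fp)  # ~0.877 -> "87% accuracy"

print(f"ML: flags {ml_coverage:.1%} of spam at {ml_accuracy:.1%} accuracy; "
      f"{ml_binding_share:.1%} of catches are binding")
print(f"Similarity: flags {sim_coverage:.1%} of spam at {sim_accuracy:.1%} accuracy")
```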
Takeaways
Our analysis of this data is that the ML anti-spam model would be a boon if deployed to Stack Overflow, and we’d like you to take a look and make sure it’s something you’d feel comfortable having guard your site. Do you have any concerns with its effectiveness as we’ve laid it out? Are there any improvements we could make to the user experience for cases where we flag something incorrectly?
We’ll be monitoring this post for feedback until Wednesday, January 21st, 2026. Our goal is to move forward with enabling ML anti-spam’s flagging capabilities once the feedback window closes. If you see any serious issues that could hinder the rollout, please detail them in an answer so we can resolve them before moving forward.