We've deployed machine learning auto-flagging
Given the positive reception, we've activated machine learning anti-spam's auto-flagging feature, as detailed below. Staff will be monitoring its work very closely over the next few weeks to ensure accuracy, especially with regard to automatic binding flags that result in instant deletion.
We're excited to see how big an impact this feature makes on the site's spam influx, and we'll be back to share the data on the ML model's effectiveness in a couple of months.
On behalf of the moderation tooling team, thank you all for your feedback and analysis of our work; your voice is an essential part of the work that we do!
We’ve started using a machine learning model to automatically identify, flag, and delete spam, and this site seems like a great place to activate it. So far it’s been quite effective on Super User, which is the only site where it’s currently flagging automatically. We’ve improved the model since we last talked about it on Meta Stack Exchange, and we’d like to share its effectiveness to determine whether it should guard your site as well.
How do the anti-spam capabilities work?
When a post is created or a post author edits their post, our systems subject it to a spam evaluation. We currently subject posts to two checks: a similarity evaluation and a pass through our ML model. The ML model is trained on several years’ worth of deleted spam from across the network, and yields a confidence score between 0% and 100%. At a high level of “spam confidence”, we’ll raise a non-binding spam flag. This non-binding flag counts towards the four spam flags that are required for automatic deletion as spam. At a very high “spam confidence” level, we will automatically delete the post with a binding spam flag.
Automatic non-binding spam flags do not affect the post’s score and are largely invisible to non-moderators. They can be dismissed by moderators like any other flag if they’re found to be unhelpful. Automatic binding spam flags attach a unique post notice, which links to an explanatory help center article. Our hope with this help center article is that legitimate posts have a visible and easy path to undeletion with the help of a handling moderator.
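The flagging flow described above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the two threshold values are placeholders (this post does not state the real cutoffs), and the names `decide`, `is_deleted`, and `Action` are hypothetical. Only the four-flag deletion requirement comes from the description above.

```python
from enum import Enum


class Action(Enum):
    NO_FLAG = "no flag"
    NON_BINDING_FLAG = "non-binding spam flag"
    BINDING_FLAG = "binding spam flag"


# ASSUMPTION: placeholder thresholds for illustration only; the post does not
# state the real cutoffs for "high" and "very high" spam confidence.
NON_BINDING_THRESHOLD = 0.90
BINDING_THRESHOLD = 0.99

# From the post: four spam flags are required for automatic deletion as spam.
FLAGS_FOR_DELETION = 4


def decide(confidence: float) -> Action:
    """Map a spam confidence in [0.0, 1.0] to the automatic action taken."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    if confidence >= BINDING_THRESHOLD:
        return Action.BINDING_FLAG
    if confidence >= NON_BINDING_THRESHOLD:
        return Action.NON_BINDING_FLAG
    return Action.NO_FLAG


def is_deleted(action: Action, community_spam_flags: int) -> bool:
    """Return True if the post ends up deleted as spam."""
    if action is Action.BINDING_FLAG:
        # A binding flag deletes the post immediately.
        return True
    # A non-binding flag counts as one flag towards the deletion requirement.
    automatic_flags = 1 if action is Action.NON_BINDING_FLAG else 0
    return community_spam_flags + automatic_flags >= FLAGS_FOR_DELETION


action = decide(0.95)
print(action, is_deleted(action, community_spam_flags=3))
```

Under these assumed thresholds, a post scored at 0.95 receives a non-binding flag, so three community spam flags would be enough to delete it instead of four.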
For more detailed information about how ML anti-spam operates, please review the network-wide announcement.
How effective would ML anti-spam be on Server Fault?
Over our year-end break, from December 19th, 2025 to January 12th, 2026, we ran the ML model in silent observation mode across the network. During this timeframe, Server Fault observed 47 spam posts. ML anti-spam, if enabled, would’ve flagged 36 of them, with 30 of those flags being binding flags resulting in instant deletion. There would also have been 2 false-positive non-binding flags, which a moderator would have been able to dismiss upon review. This means that ML anti-spam would’ve caught 76.6% of all spam (36 of 47) during this time period, and 94.7% of the flags it raised (36 of 38) would have been correct.
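For anyone who wants to check the arithmetic, the two percentages can be reproduced from the counts in this post. This is a quick sketch that assumes the accuracy figure means the share of raised flags that hit real spam (36 true positives out of 38 flags). The variable names are illustrative.

```python
# Counts reported for Server Fault during the observation window.
total_spam = 47
binding_true_positives = 30
non_binding_true_positives = 6
false_positives = 2

true_positives = binding_true_positives + non_binding_true_positives

# Share of all spam that would have been flagged (often called recall).
catch_rate = true_positives / total_spam

# Share of raised flags that were correct (often called precision).
# ASSUMPTION: this is what the post's accuracy figure refers to.
flag_precision = true_positives / (true_positives + false_positives)

print(f"caught: {catch_rate:.1%}, correct flags: {flag_precision:.1%}")
```

Rounded to one decimal place, this gives 76.6% of spam caught and 94.7% of flags correct.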
Here’s the ML model’s theoretical flagging summary data in a table:
| # Total Spam | # Autoflagged (%) | Non-binding TP flags | Binding TP Flags |
|---|---|---|---|
| 47 | 36 (76.6%) | 6 | 30 |
It is worth noting that this system will work alongside Similarity anti-spam, which is already flagging on Server Fault, so reviewing that anti-spam detector’s effectiveness over the same time period feels appropriate. Since December 10th, 2025, Similarity anti-spam caught 21, or 44.6%, of spam on Server Fault. It has, so far, observed no false positives on Server Fault. While there is some detection overlap here, it’s clear that ML anti-spam detects significantly more spam, at the cost of a couple of false-positive non-binding flags.
Takeaways
Our analysis of this data indicates that ML anti-spam would be a great addition to Server Fault’s defenses, and should help lighten the load on the community’s flags. Do you feel differently? Do you observe any issues with the data as we’ve laid it out? Does anything else problematic stick out to you?
We’ll be monitoring this post for feedback until Wednesday, January 21st, 2026. Our goal is to move forward with enabling ML anti-spam’s flagging capabilities after the end of the feedback window. If you see any serious issues that could hinder the rollout, please be sure to detail them in an answer so we can resolve them before we move forward.