TL;DR
Our anti-spam measure, which leverages similarities to recently deleted spam to help delete new spam, is now enabled network-wide. Our findings during its week-long trial on Super User indicate that it’s a boon to the network and worth activating on all sites. Improvements to how we surface recent anti-spam actions to interested parties are in progress, and we’ll continue to monitor the system to ensure it maintains the level of accuracy we expect.
We’re graduating v1 of our anti-spam measure network-wide
We’re taking the training wheels off our anti-spam measure and releasing it to all sites today! We’ve observed its behavior on Super User and determined that it’s doing a good enough job to be trusted with overseeing the network. We have some data from its first week on Super User that we’d like to share with you, along with data on posts network-wide that it would have flagged.
First, let’s make sure we’re all on the same page…
What is this anti-spam measure?
This background process compares new posts, and author edits to existing posts, against recently deleted spam; if a post is highly similar, the process raises a spam flag on it.
How does it work?
We maintain pools of the recently deleted spam that the network receives, and compare all new posts, and their author-edit activity, against them. There are two pools, the Per-site Pool and the Network-wide Pool, which retain the 100 and 500 most recently deleted spam posts, respectively.
When a new question or answer is posted or edited by its author, it is measured against the posts in these pools using a similarity algorithm. We take the highest similarity against the most similar post in either pool and use that as the post’s running spam similarity score. We sort posts into three risk buckets: “Low Risk”, “Medium Risk”, and “High Risk”. For “Low Risk” posts, we do nothing. For the other two, we take proactive action to assist in deletion.
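The scoring flow above can be sketched in a few lines. The post doesn’t name the actual similarity algorithm or the bucket cutoffs, so everything here — Jaccard similarity over word shingles, the 0.6 / 0.9 thresholds, and the function names — is an illustrative assumption, not the real implementation:

```python
def shingles(text: str, k: int = 3) -> set:
    """Break a post into overlapping k-word shingles."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(max(len(words) - k + 1, 1))}

def similarity(a: str, b: str) -> float:
    """Jaccard similarity between two posts' shingle sets
    (a stand-in for whatever algorithm the real system uses)."""
    sa, sb = shingles(a), shingles(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def spam_score(post: str, site_pool: list, network_pool: list) -> float:
    """Running spam similarity score: the highest similarity
    against any post in either pool."""
    return max((similarity(post, spam) for spam in site_pool + network_pool),
               default=0.0)

def risk_bucket(score: float, medium: float = 0.6, high: float = 0.9) -> str:
    """Map a score to a risk bucket; the cutoffs are made-up placeholders."""
    if score >= high:
        return "High Risk"    # binding spam flag: unilateral deletion
    if score >= medium:
        return "Medium Risk"  # non-binding spam flag from the Community bot
    return "Low Risk"         # no action
```

For example, a post identical to something in a pool scores 1.0 and lands in “High Risk”, while an unrelated post scores near 0.0 and is left alone.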
We consider posts that are considerably similar to recent spam to also be spam, label them “Medium Risk”, and raise a non-binding spam flag from the Community bot user. This flag does not carry the negative effects of a traditional spam flag, such as the implicit downvote. It appears as another spam flag in the moderators’ flag dashboard and counts toward the spam flag threshold for deletion, but it has no other ill effects. Our hope is to expedite review and, where appropriate, deletion; we consider some false positives acceptable here.
We consider posts that are extremely similar or identical to recent spam as “High Risk”, and in such a case we will raise a binding spam flag on the post, which unilaterally deletes it. Our hope is that we have as close to a 0% false positive rate on these flags as possible.
The exact thresholds for these risk levels are adjustable, so if we notice a sizable number of false positives at a particular threshold, we can tune it on the fly. We can also adjust thresholds on a per-site basis, so if a particular community observes a disproportionately high number of false positives, we can respond accordingly.
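One way the per-site adjustment might be modeled is as site-level overrides merged over network-wide defaults. The configuration shape, key names, and values below are hypothetical, purely to illustrate the idea:

```python
# Network-wide default thresholds (illustrative values, not the real ones).
DEFAULTS = {"medium": 0.6, "high": 0.9}

# Hypothetical per-site overrides: a site seeing too many false
# positives gets a stricter "medium" cutoff, for example.
SITE_OVERRIDES = {
    "superuser.com": {"medium": 0.7},
}

def thresholds_for(site: str) -> dict:
    """Merge network defaults with any per-site adjustments;
    per-site values win where both are present."""
    return {**DEFAULTS, **SITE_OVERRIDES.get(site, {})}
```

With this shape, a site with no override simply inherits the network defaults, and tuning one community never affects another.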
Let’s review the data!
As mentioned, we ran a week-long trial of this anti-spam process on Super User. During the trial, 57 posts were deleted as spam on Super User in total. Of those, this system unilaterally deleted 7 posts with a binding spam flag and raised a non-binding spam flag against 12 more. Many of the posts where we raised a non-binding flag were also autoflagged by the SmokeDetector project. We had one lonely false positive over the course of the trial: a non-binding flag that a moderator reviewed and dismissed. That gives us a 95% true positive rate (19 of 20 flags) so far.
Here’s that data in a table:
| Total spam on SU during trial | # Autoflagged (%) | Non-binding flags | Binding flags |
|---|---|---|---|
| 57 | 19 (33.33%) | 13 (1 FP) | 7 |
Our data goes further than this, though. Shortly before enabling this measure on Super User, we also enabled spam similarity assessments on all posts throughout the entire network; we simply didn’t raise any flags based on them. This data is more interesting and relevant to a network-wide rollout, so let’s dive in.
Network-wide data
We turned on similarity assessments network-wide on October 21st, 2025, a couple of days before the Super User rollout. For the purposes of this data, we’re counting posts that had a spam flag cast against them and were later deleted by the Community user. Spam is a spectrum, so this doesn’t necessarily account for edge cases.
Since turning on similarity evaluations, ~590 spam posts have been deleted in total. Had this anti-spam measure been enabled network-wide, it would have raised 66 non-binding flags and 34 binding flags on those posts. In other words, it would have flagged 16.9% of the spam from the last couple of weeks. That number may seem low, but remember that we’re trying to keep our false positive binding flag rate as close to zero as possible. We’re also focusing on spam recidivism, such as the spam Super User often sees, rather than the novel spam you might see elsewhere on the network. We suspect that Super User’s autoflag percentage is higher because it’s a frequent target of recurring spam.
We also would have raised 33 non-binding spam flags on posts across the network that were determined not to be spam. While these are false positives, such flags are easily dismissed by a moderator’s manual review, and they make up 24.8% of all flags we would have raised network-wide. It’s possible that, upon investigation, some of these are posts we would consider spam or spam-adjacent. We consider this an acceptable rate, especially since non-binding spam flags don’t automatically downvote the post: raising them has no negative consequences beyond the labor cost of dismissing them. And, as mentioned above, we can always adjust the thresholds if certain sites exhibit a disproportionately high false positive rate.
We observed that no false positive binding flags would have been cast during this period of time. This is our target number, so that's great!
Here’s that data in a table:
| Total network spam | Potential # autoflagged (%) | # Non-binding (TP) | # Binding | FP non-binding |
|---|---|---|---|---|
| 590 | 100 (16.9%) | 66 | 34 | 33 |
What’s changed since the initial announcement?
We’ve made some changes to this initiative over the last couple of weeks, some of which we mentioned in an edit to our previous announcement, but we’d like to go over them all now.
First, we’ve introduced a recently auto-flagged spam dashboard for moderators to view. Moderators can access this via their “Links” page in their moderator dashboard. The URL for mods is (root URL)/spam/recent-auto-flagged.
This shows a list of all posts that we’ve, at a minimum, raised a non-binding spam flag on. Moderators can use this dashboard to keep an eye on posts that we’ve taken action against and, if necessary, reverse them by clearing spam or abusive flags on the post in that post’s “Mod” menu options.
We’re still considering opening up visibility of this dashboard to other interested parties, such as users with 10k+ reputation or other spam fighters. We’ll circle back when we’ve come to a decision, but if you have a great idea for who should be able to see it, please feel free to mention it.
Here is how the dashboard appears for Super User:
We’ve also added a post notice for posts that we cast a binding spam flag on. Our hope is that this gives onlookers of a unilateral deletion a route to raise it with a moderator for review and potential overturning:
Moderators will also be able to see entries in a post’s timeline view detailing spam risk assessments. These are generated when a post is created and whenever its author edits it. The design of these entries is still a work in progress, but we wanted to ensure moderators could see why a post was or wasn’t flagged. Here is how that timeline entry looks:
What’s next?
We’re going to keep iterating on this initiative for a while. First and foremost, we’ll be monitoring the anti-spam measure network-wide to ensure the binding flag true positive rate stays very high. We’re also making improvements to the recent auto-flagged dashboard so that moderators can surface the recent activity from this system that’s most relevant to protecting their community. A help center article on auto-flagged spam is also underway; it will describe recourse for affected users and cover portions of the explanation above for curious users and moderators. Further detection methods that we can use in concert with recent-spam similarity are in the works too, so stay tuned!
We’re very proud of the work that we’ve accomplished with this so far, and we hope to be able to expand it to capture a higher volume of spam that makes it past our other blocks. Our goal is to reduce the amount of labor involved on the community’s part in removing posts that don’t really require human intervention.
We’d also like to toss a “thank you” to everyone who weighed in on the original announcement and provided some truly great, actionable feedback on our plans. You pointed out important points of contention that we either acted on or have plans to address. Feel free to leave more feedback about this initiative below, and we’ll do our best to respond to all of you.