Why can't we blacklist questions with a description containing a mobile number and the string "loans"? [duplicate]

Question

Since we are seeing a lot of spam, why can't this platform blacklist questions whose description contains both a mobile number and the string "loans"?
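
For illustration, the rule being asked about could look roughly like the Python sketch below. Everything in it is an assumption made for the example: the names `PHONE_RE`, `LOAN_RE` and `looks_like_loan_spam` are hypothetical, and "mobile number" is taken to mean 10 to 15 digits with optional separators. It is not how Stack Exchange or Charcoal actually detect spam.

```python
import re

# Assumption: a "mobile number" is 10 to 15 digits, optionally starting with
# "+" and broken up by spaces, dots, dashes or parentheses.
PHONE_RE = re.compile(r"(?<!\d)\+?(?:[\s().-]*\d){10,15}(?!\d)")

# Assumption: "loan" or "loans" as a whole word, in any letter case.
LOAN_RE = re.compile(r"\bloans?\b", re.IGNORECASE)


def looks_like_loan_spam(body: str) -> bool:
    """Return True when the text has both a phone-like number and "loan(s)"."""
    return bool(PHONE_RE.search(body)) and bool(LOAN_RE.search(body))


# A typical spam body trips both patterns.
print(looks_like_loan_spam("Call +91 98765 43210 for instant loans"))
# Only one of the two patterns present: not flagged.
print(looks_like_loan_spam("How do I parse the phone number 98765 43210?"))
# A legitimate programming question that happens to trip both patterns.
print(looks_like_loan_spam("My loan calculator prints 1234567890 instead of 12"))
```

The last call is flagged even though it is an ordinary programming question, which is the kind of false positive a hard block at submission time would have to tolerate.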

Does this answer your question? Can a machine be taught to flag spam automatically? — Karl Knechtel, Commented Jul 22, 2023 at 4:57
We have an automated system for that; right now it is offline because of the moderator strike. — Karl Knechtel, Commented Jul 22, 2023 at 4:57
I understand the question as a request for input validation. With that in place, it would not be possible to post a question, answer, or comment containing spam, so there would be no need to flag and delete it afterwards. — U880D, Commented Jul 22, 2023 at 5:13
As I've explained here, there is an automated system to stop spam (including loan spam with phone numbers). However, that system is currently on strike. — cocomac, Commented Jul 22, 2023 at 5:48
@U880D to my understanding, some forms of that also exist; however, it's quite hard to handle this sort of thing comprehensively with regex without also hitting tons of false positives. Spammers commonly exploit "unicode symmetry" attacks to create text that is readable but does not use the sequences of characters one would naturally expect it to. — Karl Knechtel, Commented Jul 22, 2023 at 5:52
@KarlKnechtel, regarding "comprehensively with regex": I wouldn't use regex. Regarding "hitting tons of false positives": I have implemented and used naive Bayes spam filtering in the past and haven't experienced that. But that's something the Inc. would need to implement. Furthermore, someone who has access to the whole data of the site could analyze the patterns that characterize a spammer and prevent spam from being posted up front. — U880D, Commented Jul 22, 2023 at 5:59
I don't know exactly what is implemented. I am pretty sure the site staff imagine they have a vested interest in not telling us exactly what is implemented. — Karl Knechtel, Commented Jul 22, 2023 at 6:04
Unrelated to the strike, Charcoal's metasmoke server is currently down (ISP issues; expected downtime several days still); but once it's back up, you can explore its search functionality. There is a rule for phone numbers (split into several reasons, "phone number detected in" ... title, question, answer? I don't remember the precise wording) and you can separately search for "loan". You need to be a registered user to use regex search, which would easily let you find the intersection of these. My suspicion is that you'll see a precision less than 100% — tripleee, Commented Jul 22, 2023 at 7:36
Actually metasmoke.erwaysoftware.com/… has only TP (actual spam) hits in metasmoke. The same search on post bodies produces one false positive, but a very high detection rate also. — tripleee, Commented Jul 30, 2023 at 6:31
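
The point in the comments above about text that is readable but does not use the expected character sequences can be illustrated with a small Python sketch. It uses Unicode NFKC normalization, which is one partial countermeasure chosen for this example and not something the thread says Stack Exchange or Charcoal uses. The helper name `fold_lookalikes` is hypothetical.

```python
import unicodedata


def fold_lookalikes(text: str) -> str:
    """Fold compatibility characters (e.g. fullwidth letters) to plain forms."""
    # NFKC replaces compatibility variants with their canonical equivalents;
    # casefold() then removes letter-case differences.
    return unicodedata.normalize("NFKC", text).casefold()


# "loans" written with fullwidth Latin letters (U+FF4C, U+FF4F, ...).
fullwidth = "\uff4c\uff4f\uff41\uff4e\uff53"
# "loans" with a Cyrillic "o" (U+043E) swapped in for the Latin "o".
cyrillic = "l\u043eans"

print(fold_lookalikes(fullwidth) == "loans")  # True: fullwidth folds to ASCII
print(fold_lookalikes(cyrillic) == "loans")   # False: cross-script lookalike survives
```

Normalization handles the first case but not the second, because a Cyrillic letter is a distinct character and not a compatibility variant. Catching that kind of substitution needs a separate lookalike mapping, which is part of why a plain string or regex blacklist is hard to make comprehensive.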

Mast · Accepted Answer · 2023-07-22 10:17:26Z

39

The SE spam filter is notoriously broken and has been like this for a long time.

Usually the spam it misses is caught by the Charcoal project, but hostility from SE has caused the project to go on strike together with many other moderators and curators^*. This is very unfortunate, since there were plans to improve the reliability (as in availability, not accuracy) of Charcoal by increasing support from SE (currently there is next to none, apart from increased API limits). Before the AI craze, it looked like we were making progress. Unfortunately, the company has decided to scare people away instead of continuing that trend.

^* :

edited Jul 22, 2023 at 10:17

answered Jul 22, 2023 at 8:30

Mast


12

Perhaps a more diplomatic articulation would be "Stack Exchange has not put a lot of effort into their own spam filter because Charcoal was handling a lot of the effort using free volunteer resources." In retrospect, perhaps we should have gone on strike sooner (ha ha, only serious).
– tripleee
Commented Jul 22, 2023 at 10:05
13

@tripleee this already is a diplomatic articulation; no need to play down genuine harm.
– Akixkisu
Commented Jul 22, 2023 at 12:51
We have two options. The first is to not let the user submit the question at all, which I believe is more suitable; the other is to let the user post the question and have a bot delete it if it looks suspicious. That way we have two layers, both active. I am not sure how it's implemented, I'm just sharing thoughts; the team may already have thought of this. My question is why this is still happening, as it's not good for this kind of platform.
– asktyagi
Commented Jul 23, 2023 at 5:27
4

If the company wanted to run Smoke Detector, they could. The efficacy of the bot will dwindle over time without the cadre of volunteers who update its detections in response to new spam campaigns and other emerging threats, though.
– tripleee
Commented Jul 23, 2023 at 7:33
