We are seeing a lot of spam. Why can't this platform blacklist questions whose description contains both a mobile number and the string "loans"?
-
2 Does this answer your question? Can a machine be taught to flag spam automatically? – Karl Knechtel Commented Jul 22, 2023 at 4:57
-
27 We have an automated system for that; right now it is offline because of the moderator strike. – Karl Knechtel Commented Jul 22, 2023 at 4:57
-
4 I understand the question as a request to implement input validation. With that in place, it wouldn't be possible to post a question, answer, or comment containing spam, so there would be no need to flag and delete it afterwards. – U880D Commented Jul 22, 2023 at 5:13
-
12 As I've explained here, there is an automated system to stop spam (including loan spam with phone numbers). However, that system is currently on strike. – cocomac Commented Jul 22, 2023 at 5:48
-
@U880D To my understanding, some forms of that also exist; however, it's quite hard to handle this sort of thing comprehensively with regex without also hitting tons of false positives. Spammers commonly exploit "unicode symmetry" attacks to create text that is readable but does not use the sequences of characters one would naturally expect. – Karl Knechtel Commented Jul 22, 2023 at 5:52
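To make the trade-off in this comment concrete, here is a minimal Python sketch of the "phone number plus loans" rule. The patterns and the function name `looks_like_loan_spam` are assumptions made up for illustration; they are not the rules used by Stack Exchange or Charcoal. The sketch shows three things: Unicode compatibility folding (NFKC) recovers some disguised text, look-alike letters from other scripts still slip through, and an innocent post can be blocked.

```python
import re
import unicodedata

# Hypothetical patterns for illustration only.
# Ten to thirteen digits, each optionally followed by common separators.
PHONE_RE = re.compile(r"(?:\+?\d[\s\-.()]*){10,13}")
LOAN_RE = re.compile(r"\bloans?\b", re.IGNORECASE)


def looks_like_loan_spam(text: str, fold: bool = True) -> bool:
    """Return True if the text contains both a phone-like number and 'loan(s)'."""
    if fold:
        # NFKC folds compatibility characters (e.g. fullwidth letters) to plain forms.
        text = unicodedata.normalize("NFKC", text)
    return bool(PHONE_RE.search(text)) and bool(LOAN_RE.search(text))


plain = "Instant loans, call +91 98765 43210 now"
# "loans" written with fullwidth letters (U+FF4C, U+FF4F, U+FF41, U+FF4E, U+FF53).
fullwidth = "Instant \uff4c\uff4f\uff41\uff4e\uff53, call 98765 43210"
# "loans" with a Cyrillic "o" (U+043E); NFKC does not map it to the Latin letter.
cyrillic = "Instant l\u043eans, call 98765 43210"
# An innocent question that happens to contain both signals.
innocent = "My loans calculator fails for input 1234567890."

print(looks_like_loan_spam(plain))                  # caught
print(looks_like_loan_spam(fullwidth, fold=False))  # missed without folding
print(looks_like_loan_spam(fullwidth))              # caught after NFKC folding
print(looks_like_loan_spam(cyrillic))               # still missed (homoglyph)
print(looks_like_loan_spam(innocent))               # false positive
```

A hard block on submission would reject the last example outright, which is one reason a rule like this tends to be used for flagging rather than for input validation.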
-
1 @KarlKnechtel, "comprehensively with regex": I wouldn't use regex. "hitting tons of false positives": I have implemented and used naive Bayes spam filtering in the past and haven't experienced that. But that's something the Inc. would need to implement. Furthermore, someone who has access to all of the site's data could analyze the patterns that identify a spammer and prevent spam from being posted in the first place. – U880D Commented Jul 22, 2023 at 5:59
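For readers unfamiliar with the naive Bayes approach mentioned here, the following is a minimal sketch of a multinomial naive Bayes classifier with add-one smoothing, written in plain Python. The class name `NaiveBayes`, the tokenizer, and the four training sentences are all invented for illustration; this is not the commenter's implementation, and a real filter would need far more labelled data.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into runs of letters or digits."""
    return re.findall(r"[a-z]+|\d+", text.lower())


class NaiveBayes:
    """Toy multinomial naive Bayes; must be trained on both labels before use."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = Counter()

    def train(self, text, label):
        self.doc_counts[label] += 1
        self.word_counts[label].update(tokenize(text))

    def log_score(self, text, label):
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total = sum(self.word_counts[label].values())
        # Log prior: fraction of training documents carrying this label.
        score = math.log(self.doc_counts[label] / sum(self.doc_counts.values()))
        for tok in tokenize(text):
            # Add-one (Laplace) smoothing so unseen words do not zero the score.
            score += math.log((self.word_counts[label][tok] + 1) / (total + len(vocab)))
        return score

    def is_spam(self, text):
        return self.log_score(text, "spam") > self.log_score(text, "ham")


# Made-up training data, for illustration only.
clf = NaiveBayes()
clf.train("instant loans call 9876543210 now", "spam")
clf.train("cheap loans approved call 9123456780", "spam")
clf.train("how do I parse a phone number with regex", "ham")
clf.train("why does my loop call the function twice", "ham")

print(clf.is_spam("urgent loans call now"))
print(clf.is_spam("how do I call a function in a loop"))
```

Unlike a fixed pattern, the classifier weighs every word in the post, so a single shared word such as "call" does not decide the outcome on its own.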
-
I don't know exactly what is implemented. I am pretty sure the site staff imagine they have a vested interest in not telling us exactly what is implemented. – Karl Knechtel Commented Jul 22, 2023 at 6:04
-
11 Unrelated to the strike, Charcoal's metasmoke server is currently down (ISP issues; the downtime is expected to last several more days), but once it's back up, you can explore its search functionality. There is a rule for phone numbers (split into several reasons, "phone number detected in" ... title, question, answer? I don't remember the precise wording), and you can separately search for "loan". You need to be a registered user to use regex search, which would easily let you find the intersection of these. My suspicion is that you'll see a precision of less than 100%. – tripleee Commented Jul 22, 2023 at 7:36
-
1 Actually metasmoke.erwaysoftware.com/… has only TP (actual spam) hits in metasmoke. The same search on post bodies produces one false positive, but a very high detection rate also. – tripleee Commented Jul 30, 2023 at 6:31
1 Answer
The SE spam filter is notoriously broken and has been like this for a long time.
Usually, what it misses is caught by the Charcoal project, but hostility from SE has caused the project to go on strike together with many other moderators and curators*. This is very unfortunate, since there were plans to actually improve the reliability (as in availability, not accuracy) of Charcoal by increasing support from SE (currently there is next to none, except for increased API limits). Before the AI craze, it looked like we were making progress. Unfortunately, the company has decided to scare people away instead of continuing that trend.
* :
-
12 Perhaps a more diplomatic articulation would be "Stack Exchange has not put a lot of effort into their own spam filter because Charcoal was handling a lot of the effort using free volunteer resources." In retrospect, perhaps we should have gone on strike sooner (ha ha, only serious). – tripleee Commented Jul 22, 2023 at 10:05
-
13 @tripleee This already is a diplomatic articulation; no need to play down genuine harm. – Akixkisu Commented Jul 22, 2023 at 12:51
-
We have two options. The first is to not let the user submit the question at all, which I believe is more suitable. The other is to let the user post the question and have a bot delete it if it looks suspicious. Used together, these give us two active layers. I am not sure how it's implemented; I'm just sharing thoughts, and the team may have already thought of this. My question is why this is still happening, as it's not good for this kind of platform. – asktyagi Commented Jul 23, 2023 at 5:27
-
4 If the company wanted to run Smoke Detector, they could. The efficacy of the bot will dwindle over time without the cadre of volunteers who update its detections in response to new spam campaigns and other emerging threats, though. – tripleee Commented Jul 23, 2023 at 7:33