  • 16
    Very much agreed that the still privately held part of the moderation policy continues to feel very problematic, but isn't your point on "detector-ban vs. all out ban" covered by the post? If internal metrics imply that the rate of ChatGPT answers has fallen off, but that suspensions for ChatGPT answers have not, does that not cast at least potentially reasonable doubt on the suspensions themselves, regardless of the mechanisms used? I don't have the data expertise to properly analyze their methodology, but the logic used here sounds relatively sound to me on the surface. Do you disagree? Commented Jun 7, 2023 at 20:16
  • 26
    @zcoop98 - We're lacking any proof that the methods Stack is using are inherently any more accurate than the methods moderators are using. Stack has so far refused to compare detection reliability using known-good and known-bad data (i.e. creating the data specifically to test), so all we have is "take our word for it that our methods are more reliable". It's a "my word vs your word" situation as to whose detection methods are more accurate. Commented Jun 7, 2023 at 20:44
  • 6
    @zcoop98 that's one option. The other is that the predictive power of the metrics has changed. It seems likely that the subset of accounts posting generated answers would change over time. For instance, you might see that only relatively new accounts do it now, and they might post more or less often than the previous average. It's also very likely that even rudimentary evasion measures - like editing in place after pasting, or copying select output in segments from successive runs/variations - would skew the stats they're measuring. Even a minor increase in average sophistication blows it. Commented Jun 8, 2023 at 4:00
  • 1
    @zcoop98 What you said may well be correct. Unfortunately, my concern about it didn't really fit into a comment, so I've explained it over in Chat (you'll have to expand some of the messages). Commented Jun 8, 2023 at 4:29
  • 3
    Just a little nitpicking: Hugging Face is not a detector; it is just a site that provides hosting and a community for AI-related technologies. So what people usually mean is that they used a detector sample hosted on Hugging Face. Commented Jun 8, 2023 at 13:20