CodeCaster


I already posted an answer to this question, in which I try to explain why people create posts using GPT and why others do not find that behavior healthy for the community. I still wanted to post another one, as I am unable to move past this quote from the question (emphasis mine):

Stop taking us for fools

Stop trusting your tools

Whatever your methodology is, stop using it. Start listening to your moderators and users. You are off by orders of magnitude, you are ignoring users who slightly edit and mark up GPT-generated text in the answer box, you are somehow severely erring in your measurements, and you are seeing (or rather not seeing) at least hundreds of GPT-generated posts per week. You have been for half a year already.

My methodology

English is my second language. My native tongue is Dutch, and I have been hearing, speaking, reading and writing English for over three decades. I love actively working with language: researching etymology, rewriting sentences twenty times to get the perfect cadence, throwing in a joke or two here and there, and so on. According to multiple people, I'm doing a terrific job posing as a native English speaker (though not audibly; it is apparently pretty hard to get rid of a Dutch accent on your own).

As stated in my other answer, I have read and vetted literally tens of thousands of posts on this network, mainly on Stack Overflow. Besides that I am an avid reader of and poster on Reddit and Tweakers (a Dutch tech news site and forum).

One of the biggest signs is that ChatGPT will almost never tell you that what you're trying to do is a stupid idea, something developers can't hear often enough. It will do exactly what you ask for, and produce very readable code, often with comments explaining what that code does, and then proceed to repeat that code in the form of an English explanation. But the code will be conceptually and/or idiomatically wrong, if not syntactically.
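As a purely hypothetical illustration of that pattern (this is my own sketch, not taken from any flagged answer): the code below is tidy, runs, and looks helpful, yet every comment merely restates the statement beneath it instead of explaining intent.

```python
# Hypothetical sketch of the answer style described above: clean-looking,
# heavily commented, with comments that only paraphrase the code itself.

def get_user_name(user):
    # Check if the user is not None
    if user is not None:
        # Return the name of the user
        return user["name"]
    # Otherwise, return an empty string
    return ""
```

A typical accompanying explanation then repeats this once more in prose: "This function checks whether the user is not None, returns the name of the user, and otherwise returns an empty string."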

For someone like me, it is not hard to recognize the obvious signs of ChatGPT-generated text and code. And there are dozens of us. Dozens! (There is room for magnitudes of error in this statement.)

Case studies

#1: 10 answers in 30 minutes

On May 19th, right about where the GPT-suspect posts graph converges to zero, I flagged a post from a now-deleted user with the message:

You will be unable to convince me that this person, with their 10 answers showing typical GPT-generated prose in 30 minutes and all other signs, was not using GPT to craft their (incorrect) answers. I did not use tools to detect this, I used the heuristics outlined above (i.e. my brain).

#2: 10 answers in 10 minutes

Two days earlier, on May 17th, I encountered an answer which I flagged with:

[When] the server has an outage [...] my page [...] gets a Http 503 'service unavailable' returned, and that's how it stays [...] it's on a unmanned PC [...]

do I need [...] to redirect to a local page if, on refresh, http 503 (or other error) is returned?

They need code that refreshes their user-interaction-less web page when the server serving that page returns an error.
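For contrast, the retry-until-recovery logic the asker seems to need could be outlined roughly like this. This is a hypothetical sketch in Python; the real page would run equivalent logic client-side in JavaScript, with `probe` issuing the HTTP request and the success branch reloading the page.

```python
import time

def wait_for_recovery(probe, max_attempts=10, delay_s=5.0, sleep=time.sleep):
    """Poll `probe` (a callable returning an HTTP status code) until the
    server answers with a success status. Returns True once the server is
    back (the page would reload itself at that point), or False after
    giving up."""
    for _ in range(max_attempts):
        if 200 <= probe() < 400:
            return True
        sleep(delay_s)
    return False
```

For example, if successive probes return 503, 503, 200, the function returns True on the third attempt. An answer engaging with the actual question would discuss something along these lines, rather than server-side error-page configuration.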

The answerer, posting 10 GPT-generated answers in 10 minutes (an inhuman feat to begin with) after a nearly 2-year answering gap, produced this answer (screenshot), which is basically a very generic rehash of every "how to set up error handling for ASP.NET" tutorial, entirely ignoring both the premise of the question and the question itself.

Again: it is extremely unlikely that this was a false positive, and equally unlikely that this was the only user posting GPT-generated cruft that day.

I'm not in the SOCVR chatroom nor do I do review queues anymore, so I encounter all my questions and answers organically: from searches related to things I'm working on, or from trolling the frontpage while I'm not working.

I have (only) 21 more handled flags on ChatGPT answers, which are older and usually don't point to as many answers as the two flags I expanded on above. Ironically enough, most of those remaining flags were on answers posted in mid-April, just when your graph starts sharply going towards zero.

If I, in my casual browsing during the day, encounter users making such egregious use of GPT, in the very three weeks during which you claim the upper bound to be 10-15 answers a day, then one of us must be wrong. And by now I'm pretty confident it's not me.

I was in fact wrong at least once. I shouldn't have flagged this answer, accusing it and this other one of being GPT-generated. Both of those answers remain up and the writer unpunished, and rightfully so. Their writing style is what threw me off.

The moderator who handled my flag correctly noted that this was a false positive: clear evidence, if there ever was any, that moderators were not simply rubber-stamping flags or triggering on insufficient evidence.

I'm not claiming to have superhuman abilities. I'm not saying I'm free from faults. I'm merely saying that there's something in your analyses causing you to severely under-report GPT-generated content, leading to unsubstantiated policy changes that the community rightfully disagrees with.

I don't know what's causing your data to be that off, and I don't know how to express my (and others') heuristics into definitive rules (nor whether we should publicly do so), but you need to reconsider your criteria.

Please listen to your moderators and users.
