Thanks a lot for publishing this; it's much appreciated. While reading, I wondered why you didn't take different possible thresholds into account. GPT detectors are surely tunable to some extent, so this conclusion:

> This survey concluded that GPT detectors misclassify 32% (+/- 6%) of non-GPT posts on Stack Exchange sites as having been written by GPT.

only makes sense for one specific detector setting. Or was the threshold variation included in the +/- 6%?
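To illustrate the point, here is a minimal sketch in Python with entirely made-up detector scores (the `scores_human` distribution and the thresholds are my assumptions, not anything from the survey): the false-positive rate is not a fixed property of a detector but a function of the chosen threshold.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical detector scores for posts known to be human-written
# (higher score = "more likely GPT"). Purely illustrative data.
scores_human = rng.beta(2, 5, size=10_000)

# The false-positive rate is the fraction of human posts flagged as GPT,
# and it depends entirely on where the decision threshold is set.
for threshold in (0.3, 0.4, 0.5, 0.6):
    fpr = np.mean(scores_human >= threshold)
    print(f"threshold={threshold:.1f}  false-positive rate={fpr:.1%}")
```

With synthetic scores like these, the printed false-positive rate drops as the threshold rises, which is exactly why a single 32% figure needs an accompanying threshold to be interpretable.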
The country-dependent suspension rate might not be a bias. Is the underlying assumption that, absent detector bias, users in all countries behave the same with regard to GPT? It would be good to state all such assumptions explicitly.
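One way to make such an assumption checkable is to attach uncertainty to the per-country rates before interpreting any gap as bias. A minimal sketch with entirely made-up counts (the countries and numbers below are placeholders, not data from the post):

```python
import math

# Hypothetical per-country counts (users suspended, users total);
# these figures are invented to illustrate the comparison only.
counts = {
    "Country A": (120, 4_000),
    "Country B": (45, 3_000),
}

# 95% normal-approximation confidence interval for each suspension rate.
for country, (suspended, total) in counts.items():
    rate = suspended / total
    half_width = 1.96 * math.sqrt(rate * (1 - rate) / total)
    print(f"{country}: {rate:.2%} +/- {half_width:.2%}")
```

If the intervals overlap substantially, the apparent "bias" may just be sampling noise; if they don't, one still has to decide between detector bias and a genuine behavioral difference, which is precisely the assumption worth stating.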
The biggest mistake, however, seems to be not asking for feedback before making the decision. Just imagine you had presented the results and discussed them with the mods before deciding. There might simply be a discussion about false-positive rates now, and the strike might never have happened. If anything, I think this should be the take-home message: getting feedback before acting reduces risk and is often enormously helpful.