Gen AI ‘detection’ software ‘not yet reliable enough’

By SHANNON WELLS

Soon after the Ad Hoc Committee on Generative AI in Research and Teaching formed, requests began pouring in to take part in panels, presentations and workshops.

One such event found John Radzilowicz, the Teaching Center’s interim director of teaching support, teaming with Diane Litman of the School of Computing and Information (SCI) to discuss how generative AI can be used, as well as equity concerns, including the use of AI “detection” software as a counterbalance to ChatGPT’s popularity among students.

“We jumped in to explore some of the equity issues that are concerns, and this led us to … sort of a leadership role in (AI detection platforms),” Radzilowicz said. “And the sort of Cold War approach of an arms race between AI chatbots and AI detectors began, probably a second after ChatGPT hit.”

Looking at Gen AI detector platform GPTZero and other tools, the committee “decided very quickly that with our experimentation and talking to (other universities), we were seeing a high false-positive rate,” Radzilowicz said.

Meanwhile, the Turnitin software platform Pitt had used for more than 10 years added a Gen AI detection feature.

In an April 2022 column published in the University Times, J.D. Wright, a consultant at the Teaching Center, said Turnitin’s feature — rather than truly “detect” plagiarism — actually “compares student submissions to works in its database of material to find potentially problematic similarities,” generating an “Originality Report” based on the comparison.

Initially, the AI detection feature offered no University-level control to turn it on or off. “In other words, faculty immediately had this pop up on their dashboard as something that they had available to us (that claimed) a 98 percent to 99 percent success rate in terms of positive identifications of AI,” Radzilowicz noted.

Skeptical of the lofty claims, the committee took the position of “Well, we can’t stop you. It’s there. But again, we do not support the use of this, and we’re not going to recommend it for anyone’s use in class,” he said of AI detection features by Turnitin and others. “It’s just not ready for prime time.”

The committee’s early stance made Pitt “one of the first universities actually to get out there and take a position like that,” Radzilowicz said.

In June, Turnitin publicly acknowledged that the AI detection tool had a higher false-positive rate than it had previously asserted. Turnitin subsequently updated the software, allowing the Teaching Center to disable the detection feature within its suite of tools.

“Based on our professional judgment, the Teaching Center has concluded that current AI detection software is not yet reliable enough to be deployed without a substantial risk of false positives and the consequential issues such accusations imply for both students and faculty,” read a statement on the center’s Gen AI Resource Page. “Use of the detection tool at this time is simply not supported by the data and does not represent a teaching practice that we can endorse or support.

“For all these reasons, the Teaching Center will disable the AI detection tool in Turnitin, effective immediately,” the statement said.

Shannon O. Wells is a writer for the University Times. Reach him at shannonw@pitt.edu.
