AI cannot beat AI! The ChatGPT detector frequently accuses innocent students and is used by 2.1 million teachers

王林
2023-04-10 23:41:01

How would you feel if an AI wrongly labeled you as "cheating"?

This happened to Lucy Goetz, a high school senior. She wrote an original paper on socialism that received the highest score.

However, Turnitin’s AI writing detector claimed that the end of Goetz’s paper was generated using ChatGPT.

Goetz was shocked: "I'm very glad to have a good relationship with my teachers."

In other words, her teacher fortunately knew her well; otherwise she would have had no way to clear her name.

Even more surprising, this ChatGPT detector is now in use by 2.1 million teachers.

AI Can’t Beat AI

The flagged portion of Goetz’s paper is an anomaly, but it shows that detectors sometimes make mistakes.

Clearly, AI cannot defeat AI, and this could have disastrous consequences for many students.

To test Turnitin’s detector, reporter Geoffrey A. Fowler ran an experiment with five high school students, including Goetz.

They created 16 sample papers: some written by the students, some generated by AI, and some mixing both sources.

What was the result?

Turnitin’s detector made errors on more than half of the samples. It accurately identified only 6 of the 16 papers and failed outright on 3, including wrongly flagging 8% of Goetz’s original paper.

For the remaining 7 papers, Fowler said, “I will only give it partial credit, because its judgments were generally correct but it misidentified some portions of the ChatGPT-generated or mixed-source writing."

However, Turnitin claims an overall accuracy of 98% for its detector. The company also said that in its own testing, situations like Goetz's paper (i.e., false positives) occurred less than 1 percent of the time.
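To put the reported breakdown side by side with the claimed 98%, the counts from Fowler's test can be tallied as percentages. This is only a minimal sketch of the arithmetic: the three category labels are names chosen here for illustration, and a 16-paper test is far too small to be directly comparable with whatever metric Turnitin uses for its accuracy figure.

```python
# Tally of the 16-sample test as reported in the article.
# The category labels are illustrative names, not Turnitin terminology.
results = {"fully correct": 6, "partially correct": 7, "failed": 3}

total = sum(results.values())


def share(count, total):
    """Return count as a percentage of total."""
    return 100.0 * count / total


shares = {label: share(n, total) for label, n in results.items()}

for label, pct in shares.items():
    print(f"{label}: {results[label]}/{total} = {pct:.2f}%")
```

On this small sample, the detector was fully correct on 6 of 16 papers, which is well below half.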

Turnitin’s AI detector detail page assigns an overall score and highlights suspected AI-generated sentences. The company said it intentionally marked passages suspected of being generated by AI in blue instead of red and linked teacher resources below the score.

Rebecca Dell, Goetz’s AP English teacher in Concord, Calif., said it is concerning that Turnitin’s system for marking AI text doesn’t always work.

Unlike plagiarism accusations, accusations of AI cheating come with no source document as evidence, which makes it very easy for teachers to become biased against students.

Maybe not everyone is as lucky as Goetz.

Goetz said, “For students, being accused of AI cheating is especially scary. Unless your teacher knows your writing style or trusts you a great deal, there is no way to prove that you are not cheating."

Why AI detection is so difficult

Since the advent of ChatGPT, students and teachers at many universities have used it in daily homework and teaching.

However, if not restricted, ChatGPT will become the most powerful cheating tool in history, helping students write homework and even complete exam papers.

To counter this, teachers have been hoping for a simple, easy-to-use detector. Edward Tian, a 22-year-old Princeton University student, developed one on his own: GPTZero.

Even OpenAI officially announced a new detection tool of its own, called AI Text Classifier.

However, the performance of these detectors is not satisfactory.

Detecting AI-created content sounds simple. But given one email written by a person and another generated by ChatGPT, it is almost impossible for us to tell the difference.

Eric Wang, Turnitin’s vice president of artificial intelligence, said that using software to detect artificial intelligence writing comes down to statistics. From a statistical perspective, what distinguishes artificial intelligence from humans is that it is extremely consistently average.

Put plainly, AI output is very consistent, whereas human writing usually is not.

"A system like ChatGPT is like an advanced version of autocomplete, looking for the next most likely word to write. That's actually the reason why it reads so naturally. AI writing is the most probable subset of human writing."

Turnitin's detector will then "identify when writing is too consistently average". The challenge is that sometimes human writing can actually seem average, too.
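The idea of "too consistently average" can be illustrated with a toy example. The sketch below is not Turnitin's method, which the article does not describe in detail; it assumes a simple word-frequency (unigram) model in place of a real language model, and every function name in it is hypothetical. It scores each sentence by how surprising its words are on average, then reports how much that score varies from sentence to sentence. A low spread corresponds loosely to writing that is uniformly predictable.

```python
import math
import re
from collections import Counter
from statistics import mean, pstdev


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def build_unigram_model(reference_text):
    """Return a function giving each word's surprisal in bits.

    Assumption: a unigram model with add-one smoothing stands in for
    the large language models that real detectors would rely on.
    """
    counts = Counter(tokenize(reference_text))
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot for unseen words

    def surprisal(word):
        return -math.log2((counts[word] + 1) / (total + vocab))

    return surprisal


def sentence_surprisals(text, surprisal):
    """Average surprisal of the words in each non-empty sentence."""
    sentences = [s for s in re.split(r"[.!?]+", text) if tokenize(s)]
    return [mean(surprisal(w) for w in tokenize(s)) for s in sentences]


def consistency_report(text, surprisal):
    """Summarize how predictable the text is and how much that varies.

    Raises statistics.StatisticsError if the text has no sentences.
    """
    scores = sentence_surprisals(text, surprisal)
    return {"mean_surprisal": mean(scores), "spread": pstdev(scores)}


# Tiny made-up reference corpus and two made-up test texts.
reference = "the cat sat on the mat. the dog sat on the log."
surprisal = build_unigram_model(reference)

uniform_text = "the cat sat on the mat. the dog sat on the log."
varied_text = "the cat sat. quantum zebras juggle xylophones."

report_uniform = consistency_report(uniform_text, surprisal)
report_varied = consistency_report(varied_text, surprisal)
print("uniform:", report_uniform)
print("varied: ", report_varied)
```

In this toy setup the formulaic text has a smaller spread than the varied one, which also hints at why formulaic human writing could be scored as machine-like under any measure of this kind.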

In economics, mathematics, and lab reports, students tend to follow a set writing style, which means they are more likely to be mistaken for AI writing.

This is probably why Turnitin mistakenly flagged Goetz's paper: its content dealt with economics.

Wang said Turnitin worked hard to fine-tune its system to require a higher level of confidence before labeling a sentence as AI-generated, in order to reduce mistakes of this kind.

He also said that the software has made great progress. Fowler noted: “When I first tested Goetz’s paper in late January, the software identified about 50% of it as AI-generated. Turnitin ran my samples through its system again in late March, and that time only 8% of Goetz’s paper was flagged as AI-generated."

Turnitin's detector also faces other important technical limitations.

The 6 samples it identified completely correctly were all clear-cut cases: either 100% student work or entirely generated by ChatGPT.

But when tested with papers that mixed AI and human sources, it often misidentified individual sentences, or missed the human part entirely. And it was unable to detect traces of ChatGPT in papers processed through Quillbot, a rewriting program that can recombine sentences.

In addition, Turnitin’s detector may already lag behind the current state of artificial intelligence technology.

ChatGPT, for example, is now powered by GPT-4 and has become more capable of creative and stylized writing.

Nvidia scientist Jim Fan said, "I don't think the detector is reliable in the long term." Artificial intelligence will get better and better and write in an increasingly human-like manner. It’s safe to say that these language model quirks will diminish over time.

Is it a good idea to use AI for detection?

Why release an AI detector when there is a potential for error (even if only 1%)?

"Teachers want to have a deterrent effect," said Annie Chechitelli, Turnitin’s chief product officer. However, some educators worry that this may actually increase student stress levels.

On April 4, Turnitin activated this ChatGPT detector for approximately 10,700 secondary and higher education institutions. It provides an "AI-generated" score and a sentence-by-sentence analysis of student assignments.

Mitchel Sollenberger, vice provost for digital education at the University of Michigan-Dearborn, asked Turnitin not to activate AI detection for his campus in the initial release.

He is worried that teachers, who check about 20,000 student papers each semester through Turnitin, may receive false positives, leading to unfounded academic integrity investigations. Teachers should not be expected to become experts in third-party software systems.
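The scale of this concern can be estimated with simple arithmetic. The sketch below rests on loudly stated assumptions: it treats all 20,000 papers as human-written, and it takes Turnitin's "less than 1 percent" false-positive claim as an upper bound that applies uniformly per paper. Neither assumption comes from the article, and the function name is hypothetical.

```python
def expected_false_flags(human_papers, false_positive_rate):
    """Expected number of human-written papers wrongly flagged as AI."""
    return human_papers * false_positive_rate


papers_per_semester = 20_000  # figure cited for the Dearborn campus
assumed_rate = 0.01           # assumption: upper bound of "less than 1 percent"

upper_bound = expected_false_flags(papers_per_semester, assumed_rate)
print(f"Up to about {upper_bound:.0f} wrongly flagged papers per semester")
```

Under these assumptions, even a detector that meets its stated error rate could wrongly flag on the order of a couple hundred papers per semester on one campus, each of which a teacher would have to evaluate.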

Statement:
This article is reproduced from 51cto.com. If there is any infringement, please contact admin@php.cn to have it removed.