
OpenAI has a tool that can tell if you use ChatGPT to cheat, but it won’t release it

Published Aug 5th, 2024 10:19AM EDT
Image: OpenAI


OpenAI released the first public version of ChatGPT in November 2022. Soon after, schools started banning it out of fear that students would use the AI to cheat. The core problem was ChatGPT’s ability to generate text on any topic within seconds: students could produce papers on anything and turn them in without much fear of being caught, since OpenAI offered no tool that could reliably identify AI-generated text.

In the two years since, OpenAI has developed a “text watermarking” tool for ChatGPT. A report says the tool has been ready for at least a year and can detect ChatGPT text with 99.9% accuracy. However, OpenAI is hesitant to release it after a survey revealed that about a third of ChatGPT users would use the chatbot less if the anti-cheating measure were implemented.

The report comes from The Wall Street Journal, which saw internal documents describing the tool and spoke with people familiar with the matter. One of those people told the paper that deploying it is just a matter of “pressing a button.”

The tool is reportedly very effective, catching 99.9% of ChatGPT-generated text. It would watermark the text in a way humans would not be able to discern:

ChatGPT is powered by an AI system that predicts what word or word fragment, known as a token, should come next in a sentence. The anti-cheating tool under discussion at OpenAI would slightly change how the tokens are selected. Those changes would leave a pattern called a watermark.
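The WSJ report doesn’t describe OpenAI’s exact method, but published research on text watermarking, such as the “green list” scheme of Kirchenbauer et al., shows how biased token selection can leave a statistically detectable pattern. Here is a minimal Python sketch of that general idea; the toy vocabulary, bias strength, and hashing scheme are illustrative assumptions, not OpenAI’s implementation:

```python
import hashlib
import math
import random

# Toy vocabulary and parameters; purely illustrative, not OpenAI's scheme.
VOCAB = ["the", "cat", "sat", "on", "a", "mat", "dog", "ran", "fast", "slow"]
GREEN_FRACTION = 0.5  # fraction of the vocabulary favored at each step
BIAS = 2.0            # logit boost applied to "green" tokens

def green_list(prev_token: str) -> set[str]:
    """Pseudorandomly partition the vocabulary, seeded by the previous token."""
    seed = int(hashlib.sha256(prev_token.encode()).hexdigest(), 16)
    rng = random.Random(seed)
    shuffled = VOCAB[:]
    rng.shuffle(shuffled)
    return set(shuffled[: int(len(VOCAB) * GREEN_FRACTION)])

def sample_watermarked(prev_token: str, logits: dict[str, float]) -> str:
    """Nudge token selection toward the green list, leaving a hidden pattern."""
    greens = green_list(prev_token)
    boosted = {t: l + (BIAS if t in greens else 0.0) for t, l in logits.items()}
    total = sum(math.exp(l) for l in boosted.values())
    r, acc = random.random() * total, 0.0
    for tok, l in boosted.items():
        acc += math.exp(l)
        if acc >= r:
            return tok
    return tok  # fallback for floating-point edge cases

def detect(tokens: list[str]) -> float:
    """Return a z-score: high values suggest watermarked (green-heavy) text."""
    hits = sum(1 for prev, tok in zip(tokens, tokens[1:]) if tok in green_list(prev))
    n = len(tokens) - 1
    expected = n * GREEN_FRACTION
    variance = n * GREEN_FRACTION * (1 - GREEN_FRACTION)
    return (hits - expected) / math.sqrt(variance)
```

The key property is that no single word looks unusual to a reader; only the hidden skew toward each step’s green list, measured across many tokens, reveals the watermark, and only to someone who knows the seeding scheme.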

The report explains that OpenAI commissioned a study in April 2023 that showed worldwide support for a tool that would detect AI-written text: survey respondents favored such a tool by a four-to-one margin.

However, a different OpenAI study from the same month showed that 69% of ChatGPT users believed cheating-detection tech would lead to false accusations. More importantly, 30% of respondents said they would use ChatGPT less if it deployed a watermarking system that rival AI chatbots didn’t have.

Since then, OpenAI staff have debated the merits of making an anti-cheat tool available to the public. It’s not just about growing the ChatGPT user base.

Separately, OpenAI found that the anti-cheat tool would not impact the quality of ChatGPT’s text generation. Degraded output could have been a legitimate reason to hold the tool back, but that concern turned out to be unfounded.

One challenge is determining who gets access to the tool, per the WSJ. If too many people have it, bad actors could figure out the watermarking technique, rendering it largely useless. One proposal is to make the tool available to educators or to companies that help schools identify AI-written content.

OpenAI told The Journal that its ChatGPT anti-cheat tool could disproportionately affect some groups, such as non-native English speakers. That’s a point OpenAI also makes in an update to a May blog post about watermarking images generated with its AI models. The update came after the WSJ report.

OpenAI explains that its watermarking tool can be easily circumvented, offering examples of how bad actors could defeat it:

While it has been highly accurate and even effective against localized tampering, such as paraphrasing, it is less robust against globalized tampering, like using translation systems, rewording with another generative model, or asking the model to insert a special character in between every word and then deleting that character – making it trivial to circumvent for bad actors.
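To see why the insert-then-delete trick works against a token-level scheme like the one sketched above, note that it changes the token sequence the detector sees without changing a single word the reader sees. A toy illustration, using the same hypothetical green-list setup:

```python
# Toy illustration of the "insert a special character, then delete it" attack
# on a token-level watermark. In a real attack, `emitted` would come from the
# watermarked sampler; here it just shows the shape of the token stream.

# The model is asked to emit "@" between words, so each watermarked token's
# green list was seeded by the previous token -- which is usually "@":
emitted = ["the", "@", "cat", "@", "sat", "@", "on", "@", "a", "@", "mat"]

# The user then strips the "@" characters from the final text:
cleaned = [token for token in emitted if token != "@"]
print(" ".join(cleaned))  # "the cat sat on a mat"

# A detector scoring `cleaned` now seeds every green-list lookup with the
# wrong previous token ("cat" instead of "@"), so the green-token statistics
# it relies on collapse toward chance: the watermark signal is gone even
# though not a single word of the essay changed.
```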

The company says it’s also exploring cryptographically signed metadata as an alternative way to label ChatGPT-generated text:

For example, unlike watermarking, metadata is cryptographically signed, which means that there are no false positives. We expect this will be increasingly important as the volume of generated text increases. While text watermarking has a low false positive rate, applying it to large volumes of text would lead to a large number of total false positives.
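OpenAI hasn’t published details of its metadata approach, but the “no false positives” property follows from how digital signatures behave: a signature either verifies or it doesn’t, with no statistical gray zone. Here is a rough Python sketch using Ed25519 signatures from the third-party cryptography package; the payload format and key handling are illustrative assumptions:

```python
import json
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# Hypothetical provider key pair; in practice the provider would publish
# the public key so anyone could verify its outputs.
private_key = Ed25519PrivateKey.generate()
public_key = private_key.public_key()

def sign_output(text: str, model: str) -> dict:
    """Attach signed provenance metadata to a piece of generated text."""
    metadata = {"model": model, "text": text}
    payload = json.dumps(metadata, sort_keys=True).encode()
    return {"metadata": metadata, "signature": private_key.sign(payload).hex()}

def verify_output(record: dict) -> bool:
    """True only if the metadata was signed by the provider and is untouched.

    Unlike a statistical watermark, a failed check is never a false positive:
    the signature either verifies or it doesn't."""
    payload = json.dumps(record["metadata"], sort_keys=True).encode()
    try:
        public_key.verify(bytes.fromhex(record["signature"]), payload)
        return True
    except InvalidSignature:
        return False

record = sign_output("An essay about the Industrial Revolution...", "gpt-4o")
print(verify_output(record))               # True
record["metadata"]["text"] = "edited text"
print(verify_output(record))               # False: any tampering breaks it
```

The trade-off is that signed metadata only travels with the text as long as the surrounding format preserves it; copy-pasting plain text strips the metadata away, whereas a statistical watermark survives in the words themselves.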

Still, there’s no telling when OpenAI will release such a tool. Meanwhile, Google has a watermarking tool called SynthID that can detect text written with Gemini AI, though it’s not widely available either. After all, Google’s widely mocked Olympics ad just suggested it’s okay for a child to use Gemini AI to craft the perfect fan letter; Google pulled the commercial following the backlash.

These companies are also working on labeling visual AI-generated content. The updated OpenAI blog post mentioned above focuses on watermarking images, which is understandable, as AI-generated photos can be used to mislead. Hopefully, all genAI chatbots will soon make such watermarking techniques standard for text, too.

Chris Smith Senior Writer
