The Wall Street Journal reported that OpenAI built a ChatGPT writing detector to catch students who cheat on assignments. However, the company is deliberating whether to release it.
An OpenAI spokesperson confirmed to TechCrunch that the company is developing the reported tool, but said it is taking a “deliberate approach” to releasing it.
The AI detection tool is reportedly highly accurate, but it can be circumvented with simple workarounds and may unfairly affect non-native English speakers.
How does the ChatGPT writing detector work?
The Verge shared more details regarding the tool’s capabilities. For example, OpenAI says it is “99.9% effective” and resists “tampering, such as paraphrasing.”
The ChatGPT writing detector relies on text watermarking, which involves making small changes to how ChatGPT selects words.
Consequently, ChatGPT itself leaves invisible signs of AI-generated writing that the detector can identify.
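OpenAI has not published how its watermark works, so the following is only a minimal sketch of one well-known family of text watermarks, not the company's method. It assumes a toy vocabulary and a stand-in "model" that proposes random candidate words. The names `is_green`, `generate`, and `z_score` are hypothetical. The generator quietly prefers words from a secret, pseudo-random "green" subset, and the detector counts how many words fall in that subset.

```python
import hashlib
import math
import random

# Toy vocabulary; a real language model has tens of thousands of tokens.
VOCAB = [f"w{i}" for i in range(1000)]
GREEN_FRACTION = 0.5  # assumed share of the vocabulary marked "green" at each step


def is_green(prev_token: str, token: str) -> bool:
    """Pseudo-randomly label `token` as green, keyed on the previous token."""
    digest = hashlib.sha256(f"{prev_token}|{token}".encode()).digest()
    return digest[0] < 256 * GREEN_FRACTION


def generate(n_tokens: int, watermark: bool, seed: int = 0) -> list[str]:
    """Stand-in for a language model: picks one of 8 random candidates per step."""
    rng = random.Random(seed)
    tokens = []
    prev = "<s>"
    for _ in range(n_tokens):
        candidates = rng.sample(VOCAB, 8)
        if watermark:
            # The "small change" to word selection: prefer green candidates.
            green = [c for c in candidates if is_green(prev, c)]
            candidates = green or candidates
        prev = candidates[0]
        tokens.append(prev)
    return tokens


def z_score(tokens: list[str]) -> float:
    """How far the green-word count sits above what chance would produce."""
    prev = "<s>"
    hits = 0
    for tok in tokens:
        if is_green(prev, tok):
            hits += 1
        prev = tok
    n = len(tokens)
    expected = GREEN_FRACTION * n
    std = math.sqrt(n * GREEN_FRACTION * (1 - GREEN_FRACTION))
    return (hits - expected) / std


marked = generate(200, watermark=True, seed=1)
plain = generate(200, watermark=False, seed=1)
print("watermarked z-score:", round(z_score(marked), 1))
print("plain z-score:", round(z_score(plain), 1))
```

In this sketch, a reader sees nothing unusual in the watermarked words, but the detector finds far more green words than chance allows. Text whose words were chosen without the green-list preference scores close to zero.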
However, it’s vulnerable to “globalized tampering like using translation systems, rewording with another generative model, or asking the model to insert a special character in between every word and then deleting that character.”
These are already common methods of circumventing AI detection tools.
“Another important risk we are weighing is that our research suggests the text watermarking method has the potential to disproportionately impact some groups,” OpenAI said in a statement.
“For example, it could stigmatize the use of AI as a useful writing tool for non-native English speakers,” it added.
Besides the ChatGPT writing detector, OpenAI is exploring other methods such as text metadata. Unlike a watermark, metadata is cryptographically signed, which OpenAI says removes the risk of false positives.
Watermarking has a low false positive rate on any single text. However, when it is applied to large volumes of text, even a low rate adds up to a large number of false positives in total.
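A quick back-of-the-envelope calculation shows why volume matters even when the error rate per text is tiny. The 0.1% rate below is purely an assumption for illustration, not a figure OpenAI has published for false positives, and `expected_false_positives` is a hypothetical helper.

```python
# Hypothetical false positive rate: 1 in 1,000 human-written texts wrongly flagged.
ASSUMED_FALSE_POSITIVE_RATE = 0.001


def expected_false_positives(n_human_texts: int, rate: float) -> float:
    """Expected number of human-written texts wrongly flagged as AI-generated."""
    return n_human_texts * rate


for n in (1_000, 100_000, 10_000_000):
    flagged = expected_false_positives(n, ASSUMED_FALSE_POSITIVE_RATE)
    print(f"{n:>10,} texts checked -> about {flagged:,.0f} wrongly flagged")
```

The rate stays the same in every row, but the number of people wrongly accused grows in step with the number of texts checked.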
“We are still in the early stages of exploration, so it is too early to gauge how effective the approach will be, but there are characteristics of metadata that would make this approach particularly promising,” OpenAI wrote.
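OpenAI has not described its metadata format, so the sketch below only illustrates the general idea of signed metadata and why it avoids false positives. It makes two loud assumptions. First, the record layout and the names `sign_text`, `verify`, and `SECRET_KEY` are hypothetical. Second, it uses an HMAC from Python's standard library as a stand-in for a real digital signature, which would normally use a private and public key pair.

```python
import hashlib
import hmac
import json

# Hypothetical secret held by the text generator. A real system would more
# likely sign with a private key and let anyone verify with a public key.
SECRET_KEY = b"demo-key-not-for-production"


def sign_text(text: str, generator: str) -> dict:
    """Attach metadata describing the text, plus a tag only the key holder can make."""
    metadata = {
        "generator": generator,
        "sha256": hashlib.sha256(text.encode()).hexdigest(),
    }
    payload = json.dumps(metadata, sort_keys=True).encode()
    tag = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return {"metadata": metadata, "signature": tag}


def verify(text: str, record: dict) -> bool:
    """True only if the tag is valid and the metadata matches this exact text."""
    payload = json.dumps(record["metadata"], sort_keys=True).encode()
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, record["signature"]):
        return False
    return record["metadata"]["sha256"] == hashlib.sha256(text.encode()).hexdigest()


record = sign_text("An essay produced by the model.", "demo-model")
print(verify("An essay produced by the model.", record))  # valid record, same text
print(verify("An essay a student wrote alone.", record))  # record does not match
```

Under these assumptions, a text written by a person cannot carry a valid record by accident, which is the sense in which signed metadata has no false positives. The trade-off is that the check only works while the metadata stays attached to the text.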