On January 31st, OpenAI announced AI Classifier via blog post: its new software tool “trained to distinguish between AI-written and human-written text.” Less than six months later, that same post begins with the addendum “As of July 20, 2023, the AI classifier is no longer available due to its low rate of accuracy.” So what went wrong?
What is AI Classifier?
Since the dawn of people being lazy (thought to coincide with the dawn of time), we have sought out easier ways to do things. Horses were replaced by cars. Rubbing two sticks together really fast was replaced by the lighter. Humans were replaced by artificial intelligence.
That last one, of course, is not (yet) true. However, a few intrepid writers have employed AI in the age-old struggle to do less work and yield the same result – but does it yield the same result? Employers who don’t believe so have responded to this initiative (or lack thereof?) with AI text classifiers – software tools that claim to identify AI-generated text. Try reading “What is ChatGPT – and what is it used for?” or “How to use ChatGPT on mobile” for more information.
AI Classifier is one such tool. The OpenAI text classifier was up against a particularly challenging problem: how to tell whether a word was put there by a machine or by a human, when the end result is digitally identical.
Can AI detect AI-written text?
Well, the length of the input text has something to do with it – on its own, a single word is inscrutable, but an entire paragraph paints a picture. The accuracy, even relevancy, of what you write, and the tone with which you put your sentences together, all factor into how these AI detection tools work. However, marred by false positives and unfair evaluations, the classifier’s reliability was soon called into question. Surely the world’s best-funded artificial intelligence company – powered by the fine-tuned GPT model, everyone’s favourite large language model – was up to the task?
Apparently not. OpenAI’s own statistics show that the “classifier correctly identifies 26% of AI-written text (true positives) as ‘likely AI-written,’ while incorrectly labeling human-written text as AI-written 9% of the time (false positives).”
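To put those numbers in perspective, here is a quick back-of-the-envelope sketch in Python. The 50/50 split between AI-written and human-written submissions is our own assumption for illustration; the only figures taken from OpenAI are the 26% and 9% rates.

```python
# Rough illustration of what a 26% true-positive rate and 9% false-positive
# rate mean in practice. The 50/50 mix of AI-written and human-written
# documents is an assumption for the sake of the example, not OpenAI data.

TRUE_POSITIVE_RATE = 0.26   # AI-written text correctly flagged as "likely AI-written"
FALSE_POSITIVE_RATE = 0.09  # human-written text wrongly flagged as AI-written

def flagged_breakdown(ai_share: float, total_docs: int = 1000) -> None:
    """Show how many documents get flagged, and how many of those are human."""
    ai_docs = total_docs * ai_share
    human_docs = total_docs * (1 - ai_share)

    flagged_ai = ai_docs * TRUE_POSITIVE_RATE         # genuine catches
    flagged_human = human_docs * FALSE_POSITIVE_RATE  # innocent writers accused
    missed_ai = ai_docs - flagged_ai                  # AI text that slips through

    print(f"Out of {total_docs} documents ({ai_share:.0%} AI-written):")
    print(f"  correctly flagged AI text:    {flagged_ai:.0f}")
    print(f"  falsely flagged human text:   {flagged_human:.0f}")
    print(f"  AI text that slipped through: {missed_ai:.0f}")

flagged_breakdown(ai_share=0.5)
```

Under that assumed 50/50 mix of 1,000 submissions, the tool catches roughly 130 of the 500 AI-written pieces, lets around 370 slip through, and still flags about 45 genuinely human writers – hardly a sound basis for an accusation of misconduct.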
On a dataset drawn from a variety of sources (including InstructGPT), the detector was unable to correctly identify AI-written text most of the time. Perhaps worse, it could be used to get good people into trouble with false claims that they were using an AI chatbot like ChatGPT. More effective provenance techniques are required before this technology can be reliably used in any disciplinary sense. These techniques will develop over the coming months, whether at OpenAI or elsewhere. Mitigating misinformation campaigns remains an important frontier, and the mechanisms built to confront it stand to benefit creatives and employers alike. OpenAI has also been tackling other forms of media, working to detect whether audio or visual content is AI-generated, which presents its own set of challenges.
AI Classifier shuts down 6 months after launch
In the original blog post on OpenAI.com, OpenAI states that “While this resource is focused on educators, we expect our classifier and associated classifier tools to have an impact on journalists, mis/dis-information researchers, and other groups. We are engaging with educators in the United States to learn what they are seeing in their classrooms and to discuss ChatGPT’s capabilities and limitations, and we will continue to broaden our outreach as we learn.”
In light of the failure to produce an accurate text classifier, the post goes on to concede that the “classifier has a number of important limitations. It should not be used as a primary decision-making tool, but instead as a complement to other methods of determining the source of a piece of text. The classifier is very unreliable on short texts (below 1,000 characters). Even longer texts are sometimes incorrectly labeled by the classifier. Sometimes human-written text will be incorrectly but confidently labeled as AI-written by our classifier. We recommend using the classifier only for English text”, which represents a damning blow to our hopes for the prevention of plagiarism and general academic dishonesty. However, more recent AI systems have shown great promise, and a new classifier may yet take its place.
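For anyone still tempted to run text through a detector, that 1,000-character floor is at least easy to enforce before trusting any verdict. Below is a minimal, purely illustrative sketch – `detector_score` is a hypothetical stand-in for whatever detection tool you might use, not an OpenAI API:

```python
# Purely illustrative pre-check before trusting any AI-text detector.
# `detector_score` is a hypothetical placeholder, not an OpenAI API.
from typing import Callable

MIN_CHARS = 1000  # OpenAI called its classifier "very unreliable" below this length

def review(text: str, detector_score: Callable[[str], float]) -> str:
    """Return a cautious verdict, refusing to judge samples that are too short."""
    if len(text) < MIN_CHARS:
        return "Sample too short for any detector to judge reliably."
    score = detector_score(text)  # e.g. estimated probability the text is AI-written
    # Even on long texts, treat the score as one signal among many, never as proof.
    return f"Detector score: {score:.2f} (use alongside other evidence)."

# Example usage with a dummy scorer:
print(review("Too short.", detector_score=lambda t: 0.5))
```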
We don’t see this as the end of the story. Where there’s a will, there’s a way, and defending our creative professionals is a worthwhile endeavour. Good classifiers can exist, and we cannot give up this fight, for the sake of all the truly talented artists who do not deserve to be ‘replaced’ by AI.