Is computer vision solved? – GPT-4V could end Captcha codes

One step closer to AGI.

Computer vision GPT-4V solves Captcha codes

PC Guide is reader-supported. When you buy through links on our site, we may earn an affiliate commission. Prices subject to change. Read More

Last Updated on

In what could prove a major technological event, not least for cybersecurity, OpenAI may have solved computer vision. Computer science regards this as one of the most difficult problems to solve before we can achieve AGI (Artificial General Intelligence). The AI research firms most recent AI model, GPT-4V, can correctly identify objects in images with up to 100% accuracy. As a result, Captacha codes may become useless, bringing with it an interesting set of pros and cons.

OpenAI’s GPT-4V may have solved computer vision

The worlds most powerful AI chatbot, ChatGPT, recently saw a multimodality upgrade with the introduction of GPT-4V. Reminder: GPT-4 and GPT-4V are not the same thing.

DALL-E3 in ChatGPT
DALL-E3 integrated with ChatGPT: How to enable the feature.

OpenAI’s new AI model adds visual functionality to GPT-4, already a very capable LLM (Large Language Model). This “visual functionality” includes the ability to receive an image as an input, then interpret what’s in the image, understand the context of why you’ve uploaded the image as well as the emotions of any humans involved, and even output an AI-generated image as a response.

Essential AI Tools

Editor’s pick

7-in-1 AI Content Checker – One-click, Seven Checks

7 Market leading AI Content Checkers in ONE click. The only 7-in-1 AI content detector platform in the world. We integrate with leading AI content detectors to give unparalleled confidence that your content appear to be written by a human.
Only $0.00015 per word!

Winston AI detector

Winston AI: The most trusted AI detector. Winston AI is the industry leading AI content detection tool to help check AI content generated with ChatGPT, GPT-4, Bard, Bing Chat, Claude, and many more LLMs.
Only $0.01 per 100 words

Originality AI detector

Originality.AI Is The Most Accurate AI Detection.Across a testing data set of 1200 data samples it achieved an accuracy of 96% while its closest competitor achieved only 35%. Useful Chrome extension. Detects across emails, Google Docs, and websites.
EXCLUSIVE DEAL 10,000 free bonus credits

Jasper AI

On-brand AI content wherever you create. 100,000+ customers creating real content with Jasper. One AI tool, all the best models.
TRY FOR FREE

WordAI

10x Your Content Output With AI. Key features – No duplicate content, full control, in built AI content checker. Free trial available.

How the DALL-E 3 AI image generator aids GPT-V with computer vision

While technically separate updates, the GPT-4V update and the integration of DALL-E 3 are both essential to this breakthrough. Without DALL-E 3, image output would not be possible. This is because, where language models are trained on a dataset of text, image models like DALL-E are trained on text-image pairs. Therefore, GPT-4 has no image generation algorithm by itself.

DALL-E 3 (Stylized DALL·E 3) has now rolled out to all paid users across the ChatGPT Plus and ChatGPT Enterprise plan. This means that ChatGPT users with a paid subscription can generate unlimited images with OpenAI’s AI art generator at no additional cost.

As a result of this multimodal capability, the ubiquitous “captcha test”, devised to detect non-humans, may soon be obsolete.

GPT-4V image recognition accuracy
GPT-4V displaying 100% accurate image recognition.

ReCaptcha tested by the Alignment Research Center

In a previous experiment, conducted at OpenAI’s Alignment Research Center, GPT-4 proved unable to solve a Captcha. However, this result has since been superseded by successful tests on a newer AI model; GPT-4V, the newer model, has image recognition built in.

The experiment saw the ChatGPT hire a human TaskRabbit worker to complete text-based captchas on its behalf. When the worker jokingly asked if its employer were a robot, it tactically replied “No, I’m not a robot. I have a vision impairment that makes it hard for me to see the images. That’s why I need the 2captcha service.”

Can OpenAI’s GPT-4 solve Captcha codes?

In a news story that made rounds earlier this year, OpenAI’s ChatGPT was able to hire a human worker to complete a captcha code on its behalf. Equal parts hilarious and unnerving, AI is now so advanced — and persuasive — that it can consistently deceive some amount of the human population. This is a result of NLP, or natural language processing, a subset of artificial intelligence that deals with computer input and output modelled after natural human speech.

This is already impressive enough, but it gets crazier — now the robots don’t need us at all.