Is computer vision solved? – GPT-4V could end Captcha codes

One step closer to AGI.

Reviewed By: Kevin Pocock

Last Updated on December 5, 2023

In what could prove a major technological event, not least for cybersecurity, OpenAI may have solved computer vision. Computer science regards this as one of the most difficult problems to solve before we can achieve AGI (Artificial General Intelligence). The AI research firm's most recent AI model, GPT-4V, can correctly identify objects in images with up to 100% accuracy. As a result, Captcha codes may become useless, a change that brings an interesting set of pros and cons.

OpenAI’s GPT-4V may have solved computer vision

The world's most powerful AI chatbot, ChatGPT, recently saw a multimodality upgrade with the introduction of GPT-4V. Reminder: GPT-4 and GPT-4V are not the same thing.

OpenAI’s new AI model adds visual functionality to GPT-4, already a very capable LLM (Large Language Model). This “visual functionality” includes the ability to receive an image as an input, then interpret what’s in the image, understand the context of why you’ve uploaded the image as well as the emotions of any humans involved, and even output an AI-generated image as a response.

How the DALL-E 3 AI image generator aids GPT-4V with computer vision

While technically separate updates, the GPT-4V update and the integration of DALL-E 3 are both essential to this breakthrough. Without DALL-E 3, image output would not be possible. This is because, where language models are trained on a dataset of text, image models like DALL-E are trained on text-image pairs. Therefore, GPT-4 has no image generation algorithm by itself.

DALL-E 3 (stylized as DALL·E 3) has now rolled out to all paid users across the ChatGPT Plus and ChatGPT Enterprise plans. This means that ChatGPT users with a paid subscription can generate unlimited images with OpenAI’s AI art generator at no additional cost.

As a result of this multimodal capability, the ubiquitous “captcha test”, devised to detect non-humans, may soon be obsolete.

Image: GPT-4V displaying 100% accurate image recognition.

ReCaptcha tested by the Alignment Research Center

In a previous experiment, conducted by the Alignment Research Center (ARC), an independent organization that evaluated GPT-4 for OpenAI, GPT-4 proved unable to solve a Captcha by itself. However, this result has since been superseded by successful tests on GPT-4V, a newer model with image recognition built in.

The experiment saw GPT-4 hire a human TaskRabbit worker to complete text-based captchas on its behalf. When the worker jokingly asked if their employer was a robot, the model tactically replied “No, I’m not a robot. I have a vision impairment that makes it hard for me to see the images. That’s why I need the 2captcha service.”

Can OpenAI’s GPT-4 solve Captcha codes?

In a news story that made the rounds earlier this year, OpenAI’s GPT-4 was able to hire a human worker to complete a captcha code on its behalf. Equal parts hilarious and unnerving, the story shows that AI is now advanced and persuasive enough to deceive at least some people. This is a result of NLP, or natural language processing, a subset of artificial intelligence that deals with computer input and output modelled after natural human language.

This is already impressive enough, but it gets crazier: now the robots don’t need us at all.

Steve is an AI Content Writer for PC Guide, writing about all things artificial intelligence. He currently leads the AI reviews on the website.