ChatGPT can now see, hear, and speak! Users can add images directly into the visual ChatGPT interface. OpenAI’s powerful large language model can generate text-based conversations, including a description of image features. In addition to the NLP (Natural Language Processing) abilities we know and love, the new feature (GPT-4 image input) brings with it image recognition capabilities. Prompting with, and analyzing, a ChatGPT image alongside textual input meaning uploading an image to ChatGPT is easier than ever, with these simple steps.
Quick answer
To access ChatGPT’s new image upload feature, you’ll have to be on a paid subcription to chatgpt. Once you’re signed up and have located the image icon required to upload, you’re ready to go.
How to upload an image to ChatGPT
OpenAI announced via a tweet on March 14th, 2023, that the OpenAI model added “visual input”. This meant that the neural network known as “GPT-4 can accept images as inputs and generate captions, classifications, and analyses”. This was initially limited to the GPT-4 LLM via the OpenAI API – but this is now possible through the ChatGPT interface in a web browser or mobile app. Here’s how:
Step
Open ChatGPT
Open ChatGPT either via a web browser (for desktop or mobile) or via the ChatGPT app (for Android and iOS).
Set up your subscription
Subscribe to ChatGPT Plus ($20/month), ChatGPT Team ($25/month x 2 users minimum) or ChatGPT Enterprise if you aren’t already.
Step
Select the camera icon
Tap either the camera icon (to take a new photo) or the image icon (to select one from your camera roll) if you’re using the mobile app. Or if you’re using the web browser version on a desktop, select the Paperclip icon found in the bottom left corner of the keyboard.
Step
Select image
After selecting the image/paperclip icon your image folder should appear, you can then simply select the image you’d like to input and it will be uploaded to your chatbot.
Step
Type your prompt
Type your text prompt alongside the loaded image.
Step
Press enter
Hit enter, and ChatGPT will process your image and associated text prompt accordingly!
The free version of ChatGPT (GPT-3.5) can’t accept image prompts. To add images to ChatGPT (image prompting), you’ll need to subscribe to one of the three paid subscriptions — ChatGPT Plus, ChatGPT Team, or ChatGPT Enterprise. The GPT-4 API can also be used as an image generation tool.
Essential AI Tools
Content Guardian – AI Content Checker – One-click, Eight Checks
Originality AI detector
Jasper AI
WordAI
Copy.ai
How to upload an image to ChatGPT with plugins
If the picture input feature is not working for whatever reason, ChatGPT plugins still provide a workaround:
Step
Open ChatGPT
Open ChatGPT either via a web browser (for desktop or mobile) or via the ChatGPT app (for Android and iOS).
Step
Set up ChatGPT subscription
Ensure you’re subscribed to ChatGPT Plus ($20/month) or ChatGPT Enterprise.
Step
Install plugins
Install the Image Converter and/or SceneXplain plugins.
Step
Provide a direct link to image
The ChatGPT interface does not allow you to add an image directly. Instead, upload your image to a service that provides a direct link to that image. This does not include Google Drive.
Step
Input link in text prompt
The direct link to an image will end with a file extension (in this case, .jpg or .png). Use the link to that image in your text prompt.
Step
Ask ChatGPT about the image!
Input questions about the image in the text prompt.
Depending on how closely linked your question is with image recognition, you’ll receive an answer of some degree of accuracy. For example, prompting an image of a banana and asking “What fruit is this?” Is much more likely to receive an accurate answer than asking “How many calories are in this fruit?”.
Using ChatGPT to generate images
If you’re limited to GPT-3.5, your best option is to describe the image in words and let ChatGPT generate a text-based conversation around that description. For example, if you want to share a picture of a cat with someone in ChatGPT, you could describe the cat in words such as “a small, fluffy cat with green eyes and a white belly.”
You can then copy and paste the results into a text-to-image generation tool like MIdjourney, DeepAI, or DALL-E. If that’s not an option, then try an alternative large language model (LLM) with multimodal capabilities such as Microsoft Copilot or Google Bard.
These tools use advanced machine learning algorithms to generate images based on textual descriptions, which can be a great way to bring your ChatGPT conversations to life.
While the user interface of the Open AI Chat GPT AI chatbot may resemble that of AI art generators like Midjourney, the tech under the hood is not the same. The former uses a large language model (LLM) and the latter combines that with a diffusion model. It does this by using a sidecar model trained on visual training data to recognize the content of an image. The processing that goes into this is pretty dense and scientific. In short, it just has a lot of examples of common visual stimuli. Objects, animals, and places all count as stimuli, and ChatGPT recognizes the similarities between them.
What can I use the ChatGPT image input feature for?
Image analysis: Ask AI about objects in images, analyze documents, or explore visual content. Remember, images can be screenshots of a text-based document – ChatGPT can read images and, it can also create images. Then “add more images in later turns to deepen or shift the discussion. Return anytime with new photos.”
Image annotation: Draw attention to specific areas by circling or drawing arrows with a photo edit markup tool on your image before uploading. “This guides ChatGPT to focus on elements you deem important.”
What’s the future of ChatGPT image input?
OpenAI co-founder Sam Altman is hopeful about the future of image recognition, and generation with ChatGPT.
The official GPT-4 page reads “Following the research path from GPT, GPT-2, and GPT-3, our deep learning approach leverages more data and more computation to create increasingly sophisticated and capable language models”. Considering the undeniable pressure of competition in multimodality, we can expect to see text-to-image, text-to-audio, image-to-audio, and perhaps even text-to-video capabilities in the coming months and years. However, it’s important to know ChatGPT’s limitations and work within them to get the results.
How to upload an image to ChatGPT – final words
Image uploading prompting is about to become easier than ever. Plugins are no longer necessary to add an image to ChatGPT. In addition, with the introduction of DALL-E 3, you can now output images from ChatGPT too! This functionality is limited to paid accounts, and we expect this to continue to be the case with the eventual release of GPT-5.