Amazon Olympus AI vs Claude 2 – LLM comparison

Amazon vs Anthropic - How will they compare?

Andy Jassy's Amazon Olympus AI compared to Anthropic's Claude-2 large language model (LLM).

The e-commerce giant is joining the AI race. Amazon Olympus is not the tech giant’s first foray into artificial intelligence, as the Amazon Titan “high-performing foundation model” was added to AWS earlier this year, but it is set to be the firm’s most powerful. Rumor has it that former Head of Alexa, Rohit Prasad, is training a new large language model (LLM) that will have “2 trillion” parameters, reportedly more than OpenAI’s GPT-4! How will Anthropic’s generative AI model compare in the battle of Amazon Olympus AI vs Claude 2?

Amazon Olympus AI vs Claude 2

It may come as a surprise to some that these two prompt-based powerhouses are in the same ballpark. Both are powerful AI systems in their own right, and Claude-2 comes in 2nd place among all currently public generative AI chatbots. Anthropic’s tech is mostly spoken about in terms of AI safety, as if its chatbot is nothing more than an ethics research experiment. The perception is understandable, considering that the firm trains its AI system with an approach it calls Constitutional AI (CAI), which “shapes the outputs of AI systems according to a set of principles, with the goal of making a helpful, harmless, and honest AI assistant.”

As one of the founding members of the Frontier Model Forum, the firm is proactive with safety research, especially concerning AGI, or Artificial General Intelligence. Something of a technical advisor to lawmakers, the firm is helping shape legislation such as that outlined in US President Biden’s AI Executive Order, and discussed at UK Prime Minister Rishi Sunak’s inaugural AI Safety Summit at Bletchley Park. However, in terms of raw power, accuracy, and comprehension, Claude-2 is actually ranked 2nd among all the AI chatbot models of big tech!

Company | CEO | AI Chatbot | LLM | API | Open-source
xAI | Elon Musk | Grok | Grok-1 | No | No
OpenAI | Sam Altman | ChatGPT | GPT-3.5, GPT-4, GPT-4V, or GPT-4 Turbo | Yes | No
Google | Sundar Pichai | Bard | PaLM 2 | Yes | No
Microsoft | Satya Nadella | Bing Chat | GPT-4 | No | No
Meta | Mark Zuckerberg | Meta AI | LLaMA 2 | No | Yes
Anthropic | Dario Amodei | Claude | Claude-2 | Yes | No
Amazon | Andy Jassy | Olympus (rumored) | Olympus (rumored) | No | No
The AI chatbots of big tech.

Amazon vs Anthropic – LLMs compared

We have a side-by-side comparison of the popular AI chatbots and their respective large language models. The benchmarking was conducted by xAI, the artificial intelligence firm founded by Elon Musk, and the following results cover the well-known NLP (Natural Language Processing) models in the AI arms race.

The leading foundation models of big tech were tested across four benchmarks – GSM8k, MMLU, HumanEval, and MATH. Included in this research were OpenAI’s GPT-4 model, Anthropic’s Claude-2, Google’s PaLM 2, xAI’s Grok-1, OpenAI’s GPT-3.5, Inflection AI’s Inflection-1 (the model behind the Pi chatbot), Meta’s LLaMA 2, and xAI’s Grok-0, listed here from most to least powerful and/or accurate. Unsurprisingly, OpenAI’s ChatGPT model GPT-4 is in the lead (identical in NLP capabilities to the GPT-4V variant).

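Benchmark | Grok-0 | LLaMA 2 | Inflection-1 | GPT-3.5 | Grok-1 | PaLM 2 | Claude-2 | GPT-4
GSM8k | 56.8% | 56.8% | 62.9% | 57.1% | 62.9% | 80.7% | 88% | 92%
MMLU | 65.7% | 68.9% | 72.7% | 70.0% | 73.0% | 78% | 75% | 86.4%
HumanEval | 39.7% | 29.9% | 35.4% | 48.1% | 63.2% | N/A | 70% | 67%
MATH | 15.7% | 13.5% | 16.0% | 23.5% | 23.9% | 34.6% | N/A | 42.5%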
The large language models of big tech, as benchmarked by xAI.
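
One way to sanity-check an overall ordering is to rank the models on each benchmark and then average those ranks. The Python sketch below does that with the scores from the table above. The mean-rank method, the function names mean_rank and leaderboard, and the choice to skip N/A entries are our own assumptions for illustration, not how xAI produced or ordered its results.

```python
from statistics import mean

# Scores (percent) copied from the xAI benchmark table; None marks "N/A".
SCORES = {
    "GSM8k": {"Grok-0": 56.8, "LLaMA 2": 56.8, "Inflection-1": 62.9,
              "GPT-3.5": 57.1, "Grok-1": 62.9, "PaLM 2": 80.7,
              "Claude-2": 88.0, "GPT-4": 92.0},
    "MMLU": {"Grok-0": 65.7, "LLaMA 2": 68.9, "Inflection-1": 72.7,
             "GPT-3.5": 70.0, "Grok-1": 73.0, "PaLM 2": 78.0,
             "Claude-2": 75.0, "GPT-4": 86.4},
    "HumanEval": {"Grok-0": 39.7, "LLaMA 2": 29.9, "Inflection-1": 35.4,
                  "GPT-3.5": 48.1, "Grok-1": 63.2, "PaLM 2": None,
                  "Claude-2": 70.0, "GPT-4": 67.0},
    "MATH": {"Grok-0": 15.7, "LLaMA 2": 13.5, "Inflection-1": 16.0,
             "GPT-3.5": 23.5, "Grok-1": 23.9, "PaLM 2": 34.6,
             "Claude-2": None, "GPT-4": 42.5},
}


def mean_rank(scores):
    """Average each model's rank over the benchmarks it has a score for.

    Assumption: rank 1 is the best score, tied scores share a rank, and
    benchmarks where a model is N/A are left out of that model's average.
    """
    ranks = {}
    for results in scores.values():
        available = {m: s for m, s in results.items() if s is not None}
        for model, score in available.items():
            # Rank = 1 + number of models that scored strictly higher.
            rank = 1 + sum(other > score for other in available.values())
            ranks.setdefault(model, []).append(rank)
    return {model: mean(r) for model, r in ranks.items()}


def leaderboard(scores):
    """Model names ordered from best (lowest) to worst mean rank."""
    averages = mean_rank(scores)
    return sorted(averages, key=averages.get)


if __name__ == "__main__":
    averages = mean_rank(SCORES)
    for model in leaderboard(SCORES):
        print(f"{model}: {averages[model]:.2f}")
```

Under these assumptions, GPT-4 comes out on top with a mean rank of 1.25 and Claude-2 follows at 2.0, just ahead of PaLM 2. A different aggregation, such as averaging raw scores, could order the models differently, particularly where a model has no score for a benchmark.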

If you don’t recognise the model names above, they power AI chatbots like OpenAI’s ChatGPT, Microsoft’s Bing Chat, and Google Bard. Despite its reputation for safety, that’s Claude-2 in 2nd place! As it turns out, you can be both the tortoise and the hare.

You’ll notice Amazon Olympus AI is not even on this list. That’s because it isn’t yet released. In fact, it doesn’t even have a release date; Amazon SVP and Chief Scientist of Artificial General Intelligence, Rohit Prasad, has been quite tight-lipped about the details, but rumor has it that a release date will be announced this December. When it does finally roll out, Olympus LLM will be integrated into Alexa AI, while Amazon Titan remains available to businesses via AWS (Amazon Web Services). We most likely won’t get to try out the Titan successor until 2024.

Steve is the AI Content Writer for PC Guide, writing about all things artificial intelligence. He currently leads the AI reviews on the website.