Grok vs Claude – similarities and differences explained

They have different areas of focus, but what's similar?

John Buckley

Last Updated on March 19, 2024

March has already been a whirlwind month for AI chatbot news, with the release of Anthropic's new Claude 3.0 models and xAI's open-sourcing of Grok's model weights and architecture. Either event would be significant on its own, but coming only weeks apart, they underline the relentless pace of change in the AI industry. So what does this mean for Grok vs Claude?

Claude 3.0 was released to almost entirely positive reviews and hailed as perhaps the first genuine contender to GPT-4. Its current capabilities are considerably ahead of Grok's, but Grok's arrival as an open-source download changes the shape of the landscape, placing the largest open-source model yet available into the hands of developers. In this article, we'll delve into the features and capabilities of both chatbots and compare their strengths and weaknesses to see how they stack up.

Grok vs Claude – the background

Grok AI

Background

Grok was developed by xAI, the company founded by Elon Musk, and its development has been almost as tumultuous as its founder's recent relationship with OpenAI, a business Musk helped create in 2015 before leaving in 2018.

A public falling-out has since occurred between OpenAI and Musk, with Musk currently suing his former business colleagues. Some have suggested that Grok’s current open-source status is a reaction by Musk to his perception that OpenAI is no longer an open-source research foundation but very much a closed for-profit corporation.

Launched on November 4th, 2023, Grok was integrated into X's Premium+ tier, offering a mix of conversational chat, writing and translation ability, research assistance, and real-time access to X data. Now, with its open-source release in the wild, it is embarking on a new journey as perhaps the most advanced open-source AI in existence.

Key Features

Powered by Grok-1.0, a 314-billion-parameter Mixture-of-Experts large language model
Trained from scratch by xAI using a custom stack on top of JAX and Rust
Real-time knowledge of the world via the X platform
Unique personality with a sense of humor and a rebellious streak

Grok's Open-Sourcing

Base model checkpoint released under Apache 2.0 license on Sunday, March 17, 2024.
Architecture and weights are available, but not fine-tuned for specific tasks
Potential for collaboration and innovation within the AI community

Claude 3.0

Background

Developed by Anthropic, Claude 3.0 was released on March 4th, 2024, to near-universal acclaim. Anthropic's benchmark data shows its most advanced (subscription-based) model, Opus, outperforming Google's Gemini Ultra in all metrics and edging out GPT-4. It should be noted that the model is benchmarked against GPT-4, not GPT-4 Turbo.

Key Features

Three models: Haiku, Sonnet, and Opus.
Claude Sonnet is free to try via the website. Opus is available as a paid monthly subscription.
Vision capabilities: the ability to process photos, charts, and graphs
Code generation and language translation
200K token context window, 1M tokens available for specific uses
Accurate over long documents
Enterprise-grade security and data handling

Figure: Graphical comparison of various AI chatbots and machine learning models across different benchmarks and skill areas.

Source: Anthropic.com

Grok vs Claude benchmarks comparison

It isn't easy to compare Grok and Claude directly since they have not been benchmarked against one another. However, by looking at the tests they have both undertaken, it's clear that Claude is considerably ahead of Grok on every shared benchmark. This is unsurprising since Grok's initial training was designed to compete against GPT-3.5.

Looking at the benchmarks they share we find the following:

GSM8K: This is a benchmark based on grade school maths word problems. It doesn't sound like a difficult challenge for these vast neural networks, but it is. Neural networks rely on example pairs to learn and do not employ computational arithmetic. Despite this, significant advances are occurring, and Claude Opus is now scoring a remarkable 95.0%. Grok trails well behind in this benchmark, with a reported score of 62.9%.

MMLU: This multiple-choice test challenges how well an LLM can connect knowledge across various fields. You'd expect an LLM to handle this skill easily, but not all do. This is the closest of the shared benchmarks, with Opus scoring 86.8% to Grok's 73.0%.

HumanEval: This benchmark tests programming skills in Python, and the results are a useful gauge of an LLM's overall programming prowess. Grok's 63.2% compares respectably with the base model of GPT-4 but is beaten by every Claude model, with Opus posting an impressive 84.9%.
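To give a rough sense of what a HumanEval-style task looks like, here is a minimal sketch. The task, the function name `running_max`, and the tests are hypothetical illustrations written for this article, not items from the actual benchmark. The general idea is that the model is given a function signature and docstring, writes the body, and passes only if the accompanying unit tests succeed.

```python
from typing import Callable, List

# Hypothetical HumanEval-style task (not taken from the real benchmark).
# The model is shown the signature and docstring and must write the body.


def running_max(numbers: List[int]) -> List[int]:
    """Return a list whose element i is the largest value in numbers[: i + 1]."""
    # A candidate completion, as a model might produce it.
    result: List[int] = []
    current = None
    for n in numbers:
        if current is None or n > current:
            current = n
        result.append(current)
    return result


def check(candidate: Callable[[List[int]], List[int]]) -> bool:
    """Run the unit tests; the completion passes only if every test succeeds."""
    try:
        assert candidate([]) == []
        assert candidate([1, 3, 2, 5, 4]) == [1, 3, 3, 5, 5]
        assert candidate([-2, -5, -1]) == [-2, -2, -1]
    except AssertionError:
        return False
    return True


passed = check(running_max)
print("pass" if passed else "fail")
```

A benchmark score such as 84.9% is then, roughly speaking, the share of tasks like this one that the model's completions pass.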

MATH: This benchmark is a collection of math-based word problems that help evaluate the mathematical problem-solving abilities of LLMs. It requires LLMs to translate word problems into math formulas and solve them accurately. Again, Claude comes in well ahead, with Opus scoring 60.1% compared to Grok's 23.9%.

Benchmark	Claude 3.0: Opus	Claude 3.0: Sonnet	Claude 3.0: Haiku	Grok 1.0
GSM8K	95.0%	92.3%	88.9%	62.9%
MMLU	86.8%	79.0%	75.2%	73.0%
HumanEval	84.9%	73.0%	75.9%	63.2%
Math	60.1%	43.1%	38.9%	23.9%

The MATH benchmark, featuring Claude 3.0 scores
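As a quick way to read the table, the short sketch below computes how many percentage points separate Claude 3.0 Opus from Grok 1.0 on each shared benchmark. The scores are copied from the table; the dictionary layout and variable names are simply illustrative choices.

```python
# Scores (in percent) copied from the benchmark table in this article.
scores = {
    "GSM8K": {"opus": 95.0, "grok": 62.9},
    "MMLU": {"opus": 86.8, "grok": 73.0},
    "HumanEval": {"opus": 84.9, "grok": 63.2},
    "MATH": {"opus": 60.1, "grok": 23.9},
}

# Gap in percentage points, rounded to one decimal to avoid float noise.
gaps = {name: round(s["opus"] - s["grok"], 1) for name, s in scores.items()}

for name, gap in gaps.items():
    print(f"{name}: Opus leads Grok by {gap} percentage points")

widest = max(gaps, key=gaps.get)
narrowest = min(gaps, key=gaps.get)
print(f"Widest gap: {widest}; narrowest gap: {narrowest}")
```

On these figures the gap is widest on MATH (36.2 points) and narrowest on MMLU (13.8 points), which suggests Grok lags most on multi-step mathematical reasoning and least on broad knowledge recall.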

Is Claude Better Than Grok?

Based on current evaluations, it's hard to argue with that conclusion. However, the two bots exist for entirely different purposes. Claude is a highly capable general-purpose LLM, among the most powerful commercially available. Grok, on the other hand, is currently built to act as a real-time add-on to Musk's X service. Now that it is also open source, it is likely to be repurposed in entirely new ways and gain additional functionality and modifications that will significantly change the way it operates.

Musk's move to liberate Grok puts him on a similar path to Meta, whose open-source models, like Llama 2, are popular precisely because they can be reconfigured. By adopting a similar strategy, Musk is aligning himself with the open-source community at a time when the debate over access to powerful AI models is set to intensify.

Claude, on the other hand, is an exceptionally powerful and capable model that is beginning to prove a worthy competitor to OpenAI's GPT-4. Competition and open access to these technologies can only be beneficial in keeping safe development at the forefront of the growing debate.

About the Author

John Buckley

John is a seasoned writer and creative media producer who explores the intersection of technology and human identity. He joined PC Guide in 2025.