
Nvidia teams up with DeepSeek for R1 optimizations on Blackwell, boosting revenue by 25x

Why fight them when you can join them?

While Western and Chinese companies alike, such as OpenAI and Alibaba respectively, are trying to take down DeepSeek by pushing their own AI models into high gear, Nvidia is one company that sees the breakthrough as a positive. We saw this when DeepSeek was first released, with an Nvidia spokesperson calling it “an excellent AI advancement” and saying the company doesn't see it as a negative because “inference requires significant numbers of NVIDIA GPUs.”

Nvidia is sticking to its stance on DeepSeek: the company has announced a partnership with DeepSeek to bring DeepSeek-R1 optimizations to its Blackwell architecture. The result is a significant leap in AI inference performance that Nvidia expects to translate into far higher revenue per GPU.


AI inference is now significantly cheaper and faster

NVIDIA AI Developer announced the news on X, claiming that the new optimizations deliver 25 times more revenue at 20 times lower cost per token compared with the H100 just four weeks earlier. To get a rough idea of what that means, say your AI system normally generates $100,000 in revenue and costs $50,000 to run, leaving a net profit of $50,000. With a 25x revenue boost, revenue jumps to $2,500,000, and with 20x lower costs, running expenses drop from $50,000 to just $2,500, bringing your potential profit to $2,497,500.

https://twitter.com/NVIDIAAIDev/status/1894172956726890623
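For the curious, that back-of-the-envelope math looks like this in a few lines of Python. The dollar figures are the hypothetical example above, not real revenue data:

```python
# Back-of-the-envelope illustration of the claimed 25x revenue / 20x cost gains.
# The dollar figures are the article's hypothetical example, not real numbers.

baseline_revenue = 100_000   # hypothetical revenue on H100-class hardware
baseline_cost = 50_000       # hypothetical running cost over the same period

revenue_multiplier = 25      # Nvidia's claimed revenue uplift
cost_divisor = 20            # Nvidia's claimed reduction in cost per token

new_revenue = baseline_revenue * revenue_multiplier   # $2,500,000
new_cost = baseline_cost / cost_divisor               # $2,500

print(f"Old profit: ${baseline_revenue - baseline_cost:,.0f}")   # $50,000
print(f"New profit: ${new_revenue - new_cost:,.0f}")             # $2,497,500
```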

A key part of this boost comes from FP4 precision, which allows B200 GPUs to process more data with less power, making AI inference significantly cheaper and faster. While a 4-bit format isn't as accurate as higher-precision formats like FP8 or FP16, NVIDIA claims the optimized model still achieves 99.8% of FP8's accuracy in benchmark tests, performing nearly as well while being far more efficient.
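For a rough intuition of why fewer bits mean cheaper inference, here's a small, framework-free Python sketch of generic 4-bit quantization. It illustrates the accuracy-for-efficiency trade-off in principle; it is not Nvidia's actual FP4 pipeline:

```python
import numpy as np

# Toy illustration: values are snapped onto a coarse 4-bit grid, trading a
# little accuracy for memory and bandwidth. This simulates plain 4-bit
# integer quantization, NOT Nvidia's FP4 floating-point format.

def quantize_4bit(x: np.ndarray):
    """Map float values onto 16 levels (4 bits) via symmetric scaling."""
    scale = np.abs(x).max() / 7.0          # signed 4-bit range is [-8, 7]
    q = np.clip(np.round(x / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

weights = np.random.randn(8).astype(np.float32)
q, scale = quantize_4bit(weights)
restored = dequantize(q, scale)

print("original :", np.round(weights, 3))
print("restored :", np.round(restored, 3))
print("max error:", np.abs(weights - restored).max())
```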

Blackwell B200 crushes H100 with nearly 25x faster AI processing

Nvidia also shared a performance chart to show the impact of these improvements: Blackwell-based B200 GPUs were able to process 21,088 tokens per second. That figure alone might not sound like a big deal, but compare it with 5,899 tokens per second on the H200 (February 2025) and just 844 tokens per second on the H100 (January 2025), and it shows just how big a leap this is.
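Running the chart's numbers makes the comparison concrete, as this quick Python snippet shows:

```python
# Throughput figures from Nvidia's chart (tokens per second, per the article).
throughput = {
    "H100 (Jan 2025)": 844,
    "H200 (Feb 2025)": 5_899,
    "B200 (Feb 2025)": 21_088,
}

baseline = throughput["H100 (Jan 2025)"]
for gpu, tps in throughput.items():
    print(f"{gpu}: {tps:,} tok/s ({tps / baseline:.1f}x vs H100)")
# The B200 comes out at roughly 25x the H100's throughput.
```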

For developers eager to try it out, NVIDIA has released an FP4-optimized DeepSeek checkpoint on Hugging Face, giving early access to these efficiency gains. DeepSeek's comparatively open approach has been a key talking point among experts, and with Nvidia pushing things even further, it shows how impactful open models can be.
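If you want to pull the checkpoint yourself, the standard huggingface_hub route should work. The repo ID below is a guess for illustration, so check NVIDIA's Hugging Face page for the exact name, and be warned that the full model weighs hundreds of gigabytes:

```python
from huggingface_hub import snapshot_download

# Hypothetical repo ID for illustration; the exact name on Hugging Face may
# differ. Search NVIDIA's Hugging Face org for the FP4 DeepSeek-R1 checkpoint.
# Note: this is a very large download.
local_dir = snapshot_download(repo_id="nvidia/DeepSeek-R1-FP4")
print("Checkpoint downloaded to:", local_dir)
```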

Source: Nvidia

About the Author

Hassam boasts over seven years of professional experience as a dedicated PC hardware reviewer and writer.