
DeepSeek is even more efficient than Nvidia, says analyst, and the industry could copy them

But it lacks a fluid user experience

When DeepSeek first launched, it made a big impact in the AI market, largely because of its low computational requirements. Even more impressive, despite needing so little compute, it managed to outperform AI models from tech giants like OpenAI. Fast forward to today, and we are still uncovering just how efficient DeepSeek really is, and whether that efficiency comes with trade-offs or whether DeepSeek has simply cracked the code.

These questions stem from a recent analysis highlighting that DeepSeek serves tens of millions of daily active users (DAU) with just 2,000 GPUs. This is an astonishing feat compared to competitors like OpenAI and xAI, which rely on vastly larger GPU clusters. For instance, xAI’s latest Grok 3 AI model is powered by Colossus, a supercomputer equipped with 200,000 Nvidia GPUs.

Nvidia follows DeepSeek’s optimization methods

According to the analysis, a single H20 node (8 GPUs) can serve about 600 DeepSeek users. On that basis, while a service like WeChat would traditionally require around 400,000 GPUs to support 40 million concurrent users at 20 tokens per second (TPS) per user, DeepSeek's optimizations cut that to roughly 100,000–200,000 GPUs by operating at 10 TPS per user.
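The arithmetic behind these estimates can be sketched with a simple capacity model: total token demand is concurrent users times per-user TPS, and GPU count follows from how many tokens per second one node can sustain. The figures above are the analysis's rough estimates, and the per-node throughput used below is purely an illustrative assumption, not a published DeepSeek number.

```python
import math

def gpus_needed(concurrent_users, tps_per_user, node_throughput_tps, gpus_per_node=8):
    """Back-of-envelope GPU estimate: total token demand divided by
    per-node serving throughput, rounded up to whole nodes.
    node_throughput_tps is an illustrative assumption, not a measured figure."""
    total_tps = concurrent_users * tps_per_user
    nodes = math.ceil(total_tps / node_throughput_tps)
    return nodes * gpus_per_node

# Halving the per-user token rate (20 TPS -> 10 TPS) roughly halves
# the GPU requirement, which is the lever the analysis describes.
at_20_tps = gpus_needed(40_000_000, 20, node_throughput_tps=6_000)
at_10_tps = gpus_needed(40_000_000, 10, node_throughput_tps=6_000)
```

Whatever the true per-node throughput is, the model makes the trade-off explicit: serving each user at a lower token rate scales the required fleet down proportionally.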

“But DeepSeek has had even fewer GPUs from the very beginning, and they even had to use downgraded GPUs like A800/H20. However, they can squeeze the performance of the existing GPUs to the extreme, and their optimizations are even more effective than the official optimizations provided by NVIDIA.”

Source: Wukong, Substack

The report notes that “DeepSeek’s underlying infrastructure optimization capabilities are the most underestimated. And it can be copied by the industry.”

On top of that, unlike major tech companies that scale with high-end GPUs, the research reveals that DeepSeek has relied on downgraded GPUs like the A800 and H20 from the start. Despite this constraint, it has pushed hardware performance to the extreme, surpassing even Nvidia's own optimizations. As a result, Nvidia engineers have shared that the company is now working to integrate DeepSeek's optimization methods.

But unfortunately, there’s a tradeoff

Since DeepSeek reportedly serves tens of millions of DAUs with just 2,000 GPUs, a fraction of what other AI services require, the implication is that DeepSeek prioritizes efficiency over user experience. Unlike mainstream AI chatbots, which allocate more computing resources for lower latency and faster responses, DeepSeek users often have to wait longer for replies.
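The efficiency-versus-latency trade-off described here typically comes down to batching: packing more users into each decode step raises total node throughput but slows each individual user down. A minimal sketch, using a hypothetical linear cost model (the timing constants are made up for illustration, not DeepSeek's actual numbers):

```python
def step_time(batch, base=0.02, per_seq=0.002):
    # Hypothetical linear model: one decode step across the whole batch
    # costs a fixed overhead plus a per-sequence increment (seconds).
    return base + batch * per_seq

def per_user_tps(batch, base=0.02, per_seq=0.002):
    # Each user receives one token per step, so their token rate
    # is the inverse of the step time.
    return 1.0 / step_time(batch, base, per_seq)

def node_tps(batch, base=0.02, per_seq=0.002):
    # The node emits one token per user per step.
    return batch * per_user_tps(batch, base, per_seq)

# A larger batch serves more total tokens per second from the same
# hardware, but each user sees fewer tokens per second (higher latency).
small_batch, large_batch = 8, 64
```

Under this toy model, a batch of 64 gives the node higher aggregate throughput than a batch of 8, while each user's token rate drops, which is consistent with the report's picture of DeepSeek trading responsiveness for GPU efficiency.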

That said, DeepSeek's success shows that better software optimization can achieve similar results with far fewer resources, unlike most large companies that focus on expanding GPU clusters. If more companies follow this approach, the AI industry could shift toward lower costs, greater accessibility, and broader adoption. However, the Jevons Paradox suggests that as computing power becomes cheaper, demand for AI applications could surge, potentially increasing the need for GPUs in the long run.


About the Author

Hassam boasts over seven years of professional experience as a dedicated PC hardware reviewer and writer.