DeepSeek is even more efficient than Nvidia, says analyst, and the industry could copy them

When DeepSeek first launched, it made a big impact in the AI market, largely due to its low computational requirements. Even more impressive was that, despite needing so little computing power, it managed to outperform AI models from tech giants like OpenAI. Fast forward to today, and we are still uncovering just how efficient DeepSeek really is, and whether this efficiency comes with trade-offs or whether DeepSeek has simply cracked the code.
These questions stem from a recent analysis highlighting that DeepSeek serves tens of millions of daily active users (DAU) with just 2,000 GPUs. This is an astonishing feat compared to competitors like OpenAI and xAI, which rely on vastly larger GPU clusters. For instance, xAI’s latest Grok 3 AI model is powered by Colossus, a supercomputer equipped with 200,000 Nvidia GPUs.
Nvidia follows DeepSeek’s optimization methods
According to the analysis, DeepSeek's optimizations allow a single H20 node (8 GPUs) to serve about 600 concurrent users. This means that while a service like WeChat would traditionally require around 400,000 GPUs to support 40 million concurrent users at 20 tokens per second (TPS) per user, DeepSeek's optimizations reduce this need to around 100,000–200,000 GPUs by operating at 10 TPS per user.
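The arithmetic behind those estimates can be sketched as follows. The figures (40 million users, 20 TPS, 400,000 GPUs) come from the analysis; the assumption that per-GPU user capacity scales inversely with per-user TPS is ours, for illustration only:

```python
# Back-of-envelope serving math using the article's figures (illustrative only).

CONCURRENT_USERS = 40_000_000   # WeChat-scale service
BASELINE_TPS = 20               # tokens/s per user under traditional serving
BASELINE_GPUS = 400_000         # GPUs the analysis says this would traditionally take

# Per-GPU capacity implied by the baseline figures:
users_per_gpu = CONCURRENT_USERS / BASELINE_GPUS   # 100 users per GPU at 20 TPS

# Assumption: halving per-user throughput to 10 TPS roughly doubles the number
# of users each GPU can serve, since aggregate token throughput is the constraint.
optimized_tps = 10
optimized_users_per_gpu = users_per_gpu * (BASELINE_TPS / optimized_tps)
gpus_at_10_tps = CONCURRENT_USERS / optimized_users_per_gpu

print(users_per_gpu)    # 100.0 users per GPU at the baseline rate
print(gpus_at_10_tps)   # 200000.0, the upper end of the article's 100k-200k range
```

Further software-level optimizations, per the analysis, would push the count from that 200,000 figure toward the lower 100,000 end of the range.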
“But DeepSeek has had even fewer GPUs from the very beginning, and they even had to use downgraded GPUs like A800/H20. However, they can squeeze the performance of the existing GPUs to the extreme, and their optimizations are even more effective than the official optimizations provided by NVIDIA.”
Source: Wukong, Substack
The report notes that “DeepSeek’s underlying infrastructure optimization capabilities are the most underestimated. And it can be copied by the industry.”
On top of that, unlike major tech companies that scale with high-end GPUs, the analysis reveals that DeepSeek has relied on downgraded GPUs like the A800 and H20 from the start. Yet, despite this constraint, it has pushed hardware performance to the extreme, surpassing even Nvidia's own optimizations. As a result, Nvidia engineers have shared that the company is now working to integrate DeepSeek's optimization methods.
But unfortunately, there’s a tradeoff
Since DeepSeek reportedly serves tens of millions of DAUs with just 2,000 GPUs, a fraction of what other AI services require, this suggests that DeepSeek prioritizes efficiency over user experience. Unlike mainstream AI chatbots, which allocate more computing resources for lower latency and faster responses, DeepSeek users often have to wait longer for replies.
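That trade-off largely comes down to batching: packing more concurrent requests onto each GPU raises aggregate throughput (fewer GPUs needed) but lowers the tokens per second each individual user sees. A minimal sketch of the relationship, with a hypothetical throughput figure that is not DeepSeek's actual number:

```python
# Hypothetical illustration of the batching trade-off: a GPU has a roughly
# fixed aggregate decode budget, so per-user speed falls as more users share it.

GPU_TOKENS_PER_SECOND = 2_000   # hypothetical aggregate token throughput per GPU

def per_user_tps(users_per_gpu: int) -> float:
    """Tokens per second each user sees when this many requests share one GPU."""
    return GPU_TOKENS_PER_SECOND / users_per_gpu

# Fewer users per GPU: snappy responses, but more GPUs needed overall.
print(per_user_tps(100))   # 20.0 tokens/s per user
# Twice as many users per GPU: half the per-user speed, half the GPU count.
print(per_user_tps(200))   # 10.0 tokens/s per user
```

In practice serving systems use continuous batching and other scheduling tricks rather than a fixed split, but the basic tension between per-user latency and fleet-wide efficiency is the same.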
That said, DeepSeek’s success proves that better software optimization can achieve similar results with far fewer resources, unlike most large companies that focus on expanding GPU clusters. If more companies follow this approach, the AI industry could shift toward lower costs, greater accessibility, and broader adoption. However, the Jevons Paradox suggests that as computing power becomes cheaper, demand for AI applications could surge, potentially increasing the need for GPUs in the long run.