DeepSeek's AI Training Continues on NVIDIA Hardware as Huawei Chips Prove Unstable for Large-Scale Models

Leading Chinese AI firm DeepSeek has reportedly reverted to NVIDIA GPUs for training its advanced large language models, after earlier attempts to rely on domestic alternatives such as Huawei's Ascend chips ran into significant stability issues at large scale. The development highlights the persistent challenges facing China's ambition for self-sufficiency in cutting-edge AI hardware, despite algorithmic innovations by companies like DeepSeek.

The shift back to NVIDIA hardware for training, while Huawei's Ascend chips remain in use for inference, underscores a critical dependency on foreign technology for the most demanding AI workloads. Sources familiar with DeepSeek's projects said Huawei's Ascend processors suffered from unstable performance, weaker interconnect bandwidth, and immature software tooling, all of which are crucial for training models with hundreds of billions of parameters. DeepSeek, founded by Liang Wenfeng, has been noted for its algorithmic efficiency, reportedly training its V3 model at significantly lower cost and with less computing power than comparable Western models.

NVIDIA CEO Jensen Huang has been a vocal critic of U.S. export controls on AI chips to China, arguing that these restrictions are counterproductive. Huang has stated that such policies harm American businesses, accelerate China's domestic chip development, and risk the U.S. losing its global AI leadership. "China is one of the world’s largest AI markets and a springboard to AI success," Huang remarked, emphasizing the importance of NVIDIA's continued presence and its CUDA software ecosystem.

Despite China's substantial investment in domestic chip design and production, the gap in advanced semiconductor manufacturing and software maturity remains wide. Reports suggest that DeepSeek, among other Chinese firms, has acquired restricted NVIDIA GPUs through shell distributors and backchannels to circumvent export regulations. This ongoing reliance on NVIDIA for core training capabilities, coupled with the instability of domestic alternatives for complex tasks, illustrates the intricate geopolitical and technological dynamics of the global AI race.