  • Investors misinterpreted DeepSeek’s AI advancements, Nvidia CEO Jensen Huang said.
  • DeepSeek’s large language models were built with weaker chips, rattling markets in January.
  • Huang emphasized the importance of AI post-training in a pre-taped interview released Thursday.

Investors took away the wrong message from DeepSeek’s advancements in AI, Nvidia CEO Jensen Huang said at a virtual event aired Thursday.

DeepSeek, a Chinese AI firm owned by hedge fund High-Flyer, released a competitive, open-source reasoning model named R1 in January. The firm said the large language model underpinning R1 was built with weaker chips and a fraction of the funding of the predominant, Western-made AI models.

Investors reacted to this news by selling off Nvidia stock, resulting in a $600 billion loss in market capitalization. Huang himself temporarily lost nearly 20% of his net worth in the rout. The stock has since recovered much of its lost value.

The dramatic market response stemmed from investors' misinterpretation, Huang said in Thursday's pre-recorded interview, which was produced by Nvidia partner DDN as part of an event debuting DDN's new software platform, Infinia.

Investors have questioned whether Big Tech firms' trillions of dollars in AI infrastructure spending is necessary if less computing power is required to train models. Huang said the industry still needs computing power for post-training methods, which allow AI models to draw conclusions or make predictions after training.

As post-training methods grow and diversify, the need for the computing power Nvidia chips provide will also grow, he continued.

“From an investor perspective, there was a mental model that the world was pre-training and then inference. And inference was: you ask an AI a question, and you instantly got an answer,” he said at Thursday’s event. “I don’t know whose fault it is, but obviously that paradigm is wrong.”

Pre-training is still important, though post-training is the “most important part of intelligence. This is where you learn to solve problems,” Huang said.

DeepSeek’s innovations energize the AI world, he said.

“It is so incredibly exciting. The energy around the world as a result of R1 becoming open-sourced — incredible,” Huang said.

Nvidia spokespeople have addressed the market reaction in written statements to a similar effect, but Huang had not publicly commented on the topic until Thursday's event.

For months, Huang has been pushing back against growing concern that model scaling is in trouble. Even before DeepSeek burst into the public consciousness in January, reports that model improvements at OpenAI were slowing down roused suspicions that the AI boom might not deliver on its promise — and that Nvidia, therefore, wouldn't continue to cash in at the same rate.

In November, Huang stressed that scaling is alive and well; it has simply shifted from training to inference. Huang also said Thursday that post-training methods are “really quite intense,” and models will keep improving with new reasoning methods.

Huang’s DeepSeek comments may serve as a preview for Nvidia’s first earnings call of 2025, scheduled for February 26. DeepSeek has become a popular topic of discussion on earnings calls for companies across the tech spectrum, from Airbnb to Palantir.

Nvidia rival AMD was asked the question earlier this month, and CEO Lisa Su said DeepSeek is driving innovation that’s “good for AI adoption.”
