The DeepSeek episode: separating fact and fiction

A dose of reality in response to the DeepSeek flap over the last week


DeepSeek stirs up the AI community and beyond

DeepSeek is an AI company in China and the developer of a series of eponymous large language models (LLMs), most recently DeepSeek-R1. The bottom line up front is that R1 shocked the AI community because:

  • it is nearly on par with OpenAI's o1 model

  • it was reportedly trained at much lower cost than comparable models

  • DeepSeek is offering R1 as a cloud service at a much lower price than OpenAI’s o1

  • DeepSeek published a very good research paper1 showing novel results from a pure reinforcement-learning approach, an unexpected development

  • unlike OpenAI’s, DeepSeek R1’s chain-of-thought traces (the record of the model “thinking to itself” verbally) are unredacted, which is interesting and helpful for other researchers

Throughout the last week, DeepSeek’s legend grew, with each new thread on X seemingly competing to explain a new way that DeepSeek’s R1 means the end of the line for OpenAI, Nvidia, Meta, Anthropic, and the U.S. AI industry in general, and proves the U.S. chip export controls useless. That escalated quickly.

All of this buzz was enough for the DeepSeek discussion to break through to coverage in the legacy media, driving downloads of DeepSeek’s consumer app, more panicked takes, and market moves against Nvidia. Oops.

Misguided reactions vs reality

The key facts

Here’s what’s real: DeepSeek released their R1 model, along with a series of smaller open-sourced models distilled from R1. These models perform very well, and are small enough to run on a broad range of hardware, either as released or after quantization to shrink them further. The pricing is notable: R1 is offered at around 30x lower cost per token than OpenAI’s o1 (although DeepSeek seems to lack the capacity to serve demand). DeepSeek has a very good team of researchers and engineers who have made a real contribution to the field with the advancements described in the R1 paper1. To top it all off, they open sourced the whole thing. Fantastic.
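Why do distilled and quantized models run on ordinary hardware? Weight memory is roughly parameter count times bytes per weight, and quantization shrinks the bytes-per-weight factor. A back-of-the-envelope sketch (the 7B model size is illustrative, not a specific DeepSeek release):

```python
# Rough memory footprint of model weights:
# bytes ≈ parameters × (bits per weight / 8).
# Quantization cuts bits per weight (fp16 = 16 bits, 4-bit ≈ 4 bits).

def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (ignores activations and KV cache)."""
    bytes_total = params_billions * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

# An illustrative 7B-parameter distilled model:
print(weight_memory_gb(7, 16))  # fp16: 14.0 GB -> needs a large GPU
print(weight_memory_gb(7, 4))   # 4-bit: 3.5 GB -> fits consumer hardware
```

The 4x shrink from fp16 to 4-bit is what moves a model from datacenter GPUs down to laptops, at some cost in output quality.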

Now let’s run through some of the misguided claims being thrown around.

Does DeepSeek’s R1 release mean export controls are dumb?

No. The U.S. export controls have been less effective than they could have been, due to a hesitant introduction and drafting weakened by concessions to the semiconductor industry. Early on, the U.S. erred by telegraphing its intention to introduce controls, then implemented compromised rules that were easy to evade. China’s response, naturally, was to stockpile equipment and chips, and to acquire leading-edge GPUs via the grey market. The new U.S. “Framework for Artificial Intelligence Diffusion”2 3 makes the controls much tighter.

Dylan Patel of SemiAnalysis, one of the best-informed semiconductor industry analysts, reported back in November 2024 that DeepSeek had 50,000 Nvidia H100s4, an export-controlled GPU. And there’s not much mystery about how they got them: almost a quarter of Nvidia’s revenue comes from Singapore, and Nvidia Singapore PTE Ltd is headquartered in… Hong Kong5. The evasion of export controls extends to chipmaking equipment as well.

Nvidia revenue, Singapore highlighted - source: Quartr

Does a 30x price-performance edge mean U.S. AI companies are dumb and in trouble?

U.S. chip export controls, although leaky, created an environment in which Chinese labs must prioritize improvements to training and inference efficiency. This shouldn't be a surprise. U.S. AI companies, by contrast, awash in compute, have had less pressure to optimize for efficiency; DeepSeek in particular has made efficiency a priority.

However, the idea that a (temporary, and open-sourced) 30x constant-factor performance advantage is a threat to OpenAI, or to the whole American AI industry, is absurd once you understand that AI is developing as an exponential process. The efficiency gains driving AI progress are arriving steadily, at a rate of many orders of magnitude (OOMs) of effective compute every few years. By reasonable estimates (shown below), the period 2023-2027 will see about a 5 OOM increase, or 100,000x, in effective AI compute. 30x represents about 1.5 OOMs: significant, but not determinative in the larger context.
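The OOM arithmetic above is just base-10 logarithms; a quick check:

```python
import math

# An order of magnitude (OOM) is the base-10 logarithm of a multiplier.
ooms_30x = math.log10(30)   # the claimed 30x DeepSeek edge, in OOMs
scaleup_5oom = 10 ** 5      # 5 OOMs of effective compute as a multiplier

print(round(ooms_30x, 2))   # -> 1.48
print(scaleup_5oom)         # -> 100000
```

So the 30x edge is about 1.5 OOMs, against an expected ~5 OOM (100,000x) scale-up over 2023-2027.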

One projection of expected effective compute scale-up6

Furthermore, this 30x is there for the taking now, with existing and future Nvidia GPUs, making them more valuable than they were last month. Tell me again how Nvidia is in trouble?

What about DeepSeek’s training cost advantage? This proves the U.S. companies are just wasting money and GPUs right?

Back to that great DeepSeek R1 research paper. We know that, at a minimum, DeepSeek omitted details of certain resources used to train their models. This is not to diminish their achievement, but it should be a caution against accepting wild conclusions about DeepSeek’s superiority as serious analysis.

First, DeepSeek has a significant quantity of Nvidia H100s. These are export controlled, so it’s not surprising they don’t want to draw attention to them. Instead they report having used H800s, the inferior, non-export-controlled sibling of the H100 GPU.

Second, there appears to be strong evidence that DeepSeek used a technique called distillation to extract training data from OpenAI’s APIs in order to develop their models. Calling this theft goes too far, but pointing out that their capabilities were enabled by OpenAI’s technology does not. Again, the extreme interpretations of DeepSeek’s superiority are ill-founded.
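Distillation in this API-based sense is conceptually simple: query a stronger “teacher” model, collect its outputs, and fine-tune a smaller “student” on the resulting prompt/response pairs. A minimal sketch of the data-collection step (the `query_teacher` function is a placeholder for a real LLM API call, not anything DeepSeek has disclosed):

```python
# Sketch of API-based distillation data collection: gather
# (prompt, teacher_response) pairs to later fine-tune a student model.
import json

def query_teacher(prompt: str) -> str:
    # Placeholder for a real API call (e.g. a chat-completions endpoint).
    return f"teacher answer to: {prompt}"

def build_distillation_set(prompts, path="distill.jsonl"):
    """Write prompt/response pairs in the JSONL format that most
    fine-tuning pipelines accept."""
    with open(path, "w") as f:
        for p in prompts:
            record = {"prompt": p, "response": query_teacher(p)}
            f.write(json.dumps(record) + "\n")

build_distillation_set(["What is 2+2?", "Explain RL briefly."])
```

At scale, the same loop run against a frontier model’s API yields a large supervised dataset encoding much of the teacher’s capability, which is why API providers prohibit it in their terms of service.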

Nvidia’s Stock Drop — Alternative Explanations

Nvidia's decline on the DeepSeek news may not be about DeepSeek at all. Consider the context of Trump's statement on Monday, Jan 27th, that the administration will be "placing tariffs on foreign production of computer chips, semiconductors [...]", specifying a tariff rate of up to 100%. He continued: "we're going to look at pharmaceuticals/drugs, chips/semiconductors, we're going to look at steel and some other industries".

Interpreting Trump's remarks always has a hint of Kremlinology to it, but it's clear that this statement targets critical industries in which the U.S. has strategic vulnerabilities today. I can't help noticing that a “shock therapy” approach to developing U.S. self-sufficiency is exactly what one would expect if the U.S. prioritized critical-industry resilience against a negative scenario for Taiwan. How hard would it be for national security hawks to push Trump toward tariffs for this purpose, his self-professed “most beautiful word in the dictionary”?7

Nvidia's market decline this past week wasn't unique among semiconductor firms: TSMC, ASML, AMD and others also showed significant declines on Monday, the day of Trump's tariff announcement. In that statement Trump also derided the bipartisan CHIPS Act, passed during Biden's term and important to U.S. semiconductor firms. The point is, to put it plainly: attributing Nvidia's market decline to the DeepSeek news is not well supported.

DeepSeek is a real competitor, and China is a real competitor. But we knew that already.


Have feedback? Are there topics you’d like to see covered?
Reach out at: jeff @ roadtoartificia.com

1  DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

2  Regulatory Framework for the Responsible Diffusion of Advanced Artificial Intelligence Technology

3  Understanding the Artificial Intelligence Diffusion Framework: Can Export Controls Create a U.S.-Led Global Artificial Intelligence Ecosystem? - RAND

4  Dylan Patel on X: “Deepseek has over 50k Hopper GPUs to be clear. People need to stop acting like they only have that 10k A100 cluster. They are omega cracked on ML research and infra management but they aren't doing it with that many fewer GPUs” 

5  @DarioCpx on X: “Looks like $NVDA GPUs have never been physically in Singapore […] Guess what.. “Nvidia Singapore PTE Ltd” is headquartered in… Hong Kong!”

6  Situational Awareness, I. From GPT-4 to AGI: Counting the OOMs, Compute - Leopold Aschenbrenner

7  WSJ: Trump Calls Tariffs the ‘Most Beautiful Word’