DeepSeek is on Fire
How are Silicon Valley VC investors thinking about it?
DeepSeek is making waves today, and there's no need to elaborate—NVDA has lost nearly $600 billion in market value. Rumor has it that Jensen Huang has scrapped his plans to celebrate the Lunar New Year in Taiwan and is returning to the Bay Area instead.
But why is DeepSeek so hot? There are multiple reasons.
1. The Contrast and Underdog Narrative
For Americans, DeepSeek was a completely unknown small company that, with minimal costs and under strict chip restrictions, managed to provide an open-source and free alternative to products from OpenAI and Google. And the kicker? It’s a Chinese company.
Many non-paying, non-power users may never even have tried OpenAI's GPT-4o with its search features. And people around the world love an underdog story: just three years ago, OpenAI played the role of the small company that dethroned Google, only to become an AI giant itself. Now history repeats itself.
2. DeepSeek’s Innovations
The technical report lays it out clearly: there are plenty of innovations, though no game-changing breakthroughs. Most of its techniques, such as Mixture of Experts (MoE), Reinforcement Learning (RL), Multi-head Latent Attention (MLA), FP8 training, and Chain-of-Thought (CoT), are known tricks in the industry.
However, no one else has put all these tricks together like DeepSeek. This is reminiscent of China’s "micro-innovation" approach: knowing enough tricks and successfully implementing them can yield astonishing results.
It’s premature to dismiss it as lacking disruptive innovation—given time, who knows? After all, DeepSeek faces constraints in time, money, compute, and talent. Even without disruptive breakthroughs, accumulating small innovations can still be incredibly powerful.
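To build intuition for why MoE is such an effective cost lever, here is a minimal numpy sketch of top-k expert routing. This is a toy illustration only: the function name `moe_forward` and the gating scheme are hypothetical, and DeepSeek's production MoE additionally uses shared experts, load-balancing objectives, and FP8 kernels, none of which are shown here.

```python
import numpy as np

def moe_forward(x, gate_w, experts, k=2):
    """Toy top-k Mixture-of-Experts layer: each token is routed to its
    k highest-scoring experts, and their outputs are mixed by the
    renormalized gate probabilities. Only k experts run per token,
    which is where the compute savings come from."""
    logits = x @ gate_w                                  # (tokens, n_experts)
    probs = np.exp(logits - logits.max(-1, keepdims=True))
    probs /= probs.sum(-1, keepdims=True)                # softmax over experts
    topk = np.argsort(probs, axis=-1)[:, -k:]            # k largest gates per token
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        gates = probs[t, topk[t]]
        gates = gates / gates.sum()                      # renormalize over the chosen k
        for g, e in zip(gates, topk[t]):
            out[t] += g * experts[e](x[t])               # skip the other experts entirely
    return out
```

With, say, 2 of 256 experts active per token, the per-token FLOPs of the expert layers shrink accordingly even though total parameter count stays large.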
3. How Good is DeepSeek R1?
From the results—very good. It’s the best open-weights model to date, and on reasoning benchmarks it comes close to OpenAI’s o1.
The fact that platforms like Perplexity and Groq have already integrated it speaks volumes—it’s definitely a strong model.
4. Why is DeepSeek Doing So Well?
Open-Source Community Victory
This aligns with Yann LeCun's view that AI is one of the most open high-tech fields. Research papers are public, AI researchers take pride in winning Best Paper awards at top conferences, and OpenAI itself was initially committed to openness.
DeepSeek also benefited heavily from Meta’s LLaMA open weights and technical reports, which played a significant role in its success.
China Has Talent
China's large population and rigorous college entrance exam system help identify and cultivate top talent. There are even unverified reports that Tsinghua University’s computer science graduates increasingly prefer to pursue PhDs at their own institution.
Chinese Engineers Work Extremely Hard
Anyone who has spent time in Silicon Valley knows how comfortable many Big Tech engineers have it: some (Chinese employees included) work from home most of the time, or put in just three hours of actual work at the office.
By contrast, DeepSeek’s "non-grueling" work culture still means employees work until 7 PM daily, which is already far more intense than what most Silicon Valley engineers do.
DeepSeek’s Strong Corporate Culture
CEO Liang Wenfeng’s remarks are worth noting. He emphasizes:
Avoiding toxic work pressure
Focusing on real contributions to the open-source community
Creating actual value
Actively recruiting young talent and giving them ample space and resources to experiment
This kind of culture enables a "small company" like DeepSeek to achieve stunning results.
DeepSeek is Not Actually “Small”
Those familiar with China’s quantitative hedge fund scene know that High-Flyer (幻方) has long been one of China’s top two quant firms, at one point managing over 100 billion RMB.
More importantly, High-Flyer was one of China’s earliest adopters of large-scale GPU clusters (over 10,000 GPUs). Even today, very few Chinese companies, and even fewer global firms, can efficiently utilize such resources.
DeepSeek enjoys internal funding, has ample compute, and avoids much of the bureaucracy and inefficiency that plague large companies. Compared to most startups, it also doesn’t have to worry about fundraising, allowing it to focus purely on R&D.
5. Why is DeepSeek’s Cost So Much Lower Than OpenAI/Claude?
Training and inference costs drop exponentially for the same level of performance. If OpenAI were to retrain GPT-4o/o1 today, the costs would be dramatically lower.
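A back-of-envelope illustration of that compounding effect. The 10x-per-year efficiency gain below is a hypothetical number chosen only to make the arithmetic concrete, not a measured industry rate:

```python
def retrain_cost(original_cost, years_elapsed, gain_per_year=10.0):
    """Cost to reproduce the same capability after compounding
    algorithm + hardware efficiency gains. gain_per_year is a
    hypothetical illustration, not a measured figure."""
    return original_cost / (gain_per_year ** years_elapsed)

# A $100M-class training run, revisited two years later under a
# hypothetical 10x/year efficiency gain:
print(retrain_cost(100e6, 2))   # 1000000.0, i.e. ~$1M
```

The point is not the specific rate but the shape of the curve: a late entrant targeting last year's frontier pays a fraction of the pioneer's bill.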
DeepSeek’s reported ~$5.6 million training budget and 2,048 H800 GPUs cover only the final pre-training run; they don’t include the costs of data preparation, algorithm experimentation, and trial-and-error.
Big companies prioritize intelligence over cost efficiency, especially if they have ample compute. Unlike DeepSeek, OpenAI isn’t concerned about saving compute costs—it’s focused on building the most powerful models.
DeepSeek also traded off alignment for efficiency—alignment training is extremely expensive in terms of money, time, talent, and compute.
Anthropic wouldn’t release a model with such weak alignment.
Even at OpenAI, despite internal debates, they wouldn’t release something this underaligned either.
OpenAI’s past agility as a startup let it outmaneuver Google, but even it has since adopted a more corporate approach.
In the end, China’s strong AI talent, access to global research, and work ethic—combined with US chip restrictions, DeepSeek’s funding, and its focus on efficiency—led to an extremely cost-effective, high-performance model.
It’s no wonder US AI giants are feeling the heat.
6. Will AI Algorithms Become a Commodity?
It’s highly likely. In fact, NVIDIA may be happy about this.
If there are mature, stable open-source models, NVIDIA can simply optimize them and bundle them with GPUs, making it harder for AI companies to sell models separately. This is exactly how NVIDIA conquered autonomous driving: eventually, the whole industry effectively ended up working on top of NVIDIA’s stack.
DeepSeek’s R1 technical report is one of the most open we’ve seen—it’s fantastic and inspiring. And there are rumors of even better models dropping soon—which, given today’s events, seems highly plausible.
7. What About NVIDIA?
NVIDIA’s moat spans multiple dimensions: compute, CUDA ecosystem, interconnects, and Total Cost of Ownership (TCO).
While AMD’s latest accelerators rival NVIDIA on raw compute specs, CUDA still dominates, though it faces increasing competition from MLX, Triton, and JAX.
Large-scale interconnect issues may be solved by building giant monolithic (wafer-scale) chips, though this mostly applies to inference. Meanwhile, LPUs (Language Processing Units) are trying to carve out a slice of the inference market.
Despite challengers, NVIDIA’s execution and iteration speed remain unparalleled. As one 20-year NVIDIA veteran put it:
"In the GPU race, history suggests that nobody beats NVIDIA—unless it’s on an entirely different battlefield."
8. Final Thoughts
DeepSeek R1’s distilled models (fine-tuned LLaMA and Qwen checkpoints) are significantly stronger than their base models, even at the 7B scale.
As device compute power improves, on-device AI solutions will become more feasible, creating massive opportunities.
Yet, most AI companies today focus on cloud-based AI agents, ignoring efficiency and cost.
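On the distillation point above, here is a minimal sketch of the textbook logit-matching objective. Caveat: per the R1 report, DeepSeek's distilled models were actually produced by supervised fine-tuning on teacher-generated outputs (sequence-level distillation); the temperature-softened KL below is the classic Hinton-style variant, shown only to convey the core idea of a small student matching a stronger teacher.

```python
import numpy as np

def softened_softmax(logits, T):
    """Softmax over temperature-T softened logits."""
    z = logits / T
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def distill_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) on temperature-softened distributions,
    scaled by T^2 as in the classic distillation recipe. Zero when
    the student already matches the teacher."""
    p = softened_softmax(teacher_logits, T)        # soft teacher targets
    log_q = np.log(softened_softmax(student_logits, T))
    kl = (p * (np.log(p) - log_q)).sum(axis=-1)    # per-example KL divergence
    return float(kl.mean() * T * T)
```

A higher temperature T exposes more of the teacher's "dark knowledge" in the near-zero probabilities, which is precisely what makes small distilled models punch above their weight on-device.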
If you're working on this, please feel free to reach out to us.

