DeepSeek achieved something spectacular in a matter of months while delivering a massive shock to the US stock market. The AI company released a wildly impressive ChatGPT rival called DeepSeek AI, and it went viral a few weeks ago. No other AI firm has achieved what DeepSeek did, not even Google.
The most impressive part was that, due to US export restrictions, DeepSeek didn’t have access to the latest GPUs for AI development. So the company came up with software techniques to train an AI that performs as well as OpenAI’s reasoning models at a fraction of the cost. This feat wiped roughly $1 trillion from the US stock market, as investors were spooked that hardware might no longer be the most important thing in AI development.
While those worries might have been exaggerated, DeepSeek isn’t stopping. The company plans to launch a big DeepSeek R2 reasoning model upgrade, and it’s rushing to have it out by May. China is still banned from accessing the latest chips, so DeepSeek R2 development will rely on whatever GPU stockpiles DeepSeek might have smuggled, as well as software optimizations.
But it’s not just software or hardware that DeepSeek might excel at when competing against OpenAI and other Western AI firms. A report detailing DeepSeek’s unconfirmed plans to release R2 by May also reveals the secret sauce that made the DeepSeek R1 breakthroughs possible. Apparently, the people working at DeepSeek love it there thanks to a company culture and business practices that are uncommon among big Chinese tech firms.
Since DeepSeek R1 came out, OpenAI has released new reasoning models, including the o3-mini and o3-mini-high. OpenAI also plans to release GPT-4.5 in the coming weeks, with a larger GPT-5 upgrade to follow. This might explain the pressure on DeepSeek to rush out its own upgrades.
Per Reuters, R2 is set to arrive before May, a few weeks earlier than expected. The new model should be even better at coding than R1 and will supposedly introduce support for multilingual reasoning.
DeepSeek R2 should continue to make use of software innovations that DeepSeek already employs in its existing models. The Mixture-of-Experts (MoE) architecture allows DeepSeek to activate only the parts of an AI model required to handle a task. Then there’s Multi-head Latent Attention (MLA), which lets the model attend to multiple aspects of a prompt at once while compressing the attention cache to save memory.
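To make the Mixture-of-Experts idea more concrete, here is a minimal Python sketch of top-k expert routing. It is a toy illustration, not DeepSeek’s implementation: the function names (`moe_forward`, `make_expert`), the tiny "experts" that just scale their input, and the gate weights are all assumptions made up for this example. The point is only that a router scores every expert, but just the best few actually run.

```python
import math


def softmax(scores):
    """Turn raw router scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(scale):
    """Hypothetical expert: a stand-in network that scales its input."""
    return lambda x: [scale * v for v in x]


def moe_forward(x, gate_weights, experts, top_k=2):
    """Route input x to the top_k highest-scoring experts only.

    gate_weights holds one vector per expert; the router score is the
    dot product of x with that vector (an assumption for this sketch).
    """
    scores = [sum(w * v for w, v in zip(row, x)) for row in gate_weights]
    probs = softmax(scores)

    # Pick the top_k experts; the remaining experts are never executed.
    chosen = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]

    # Renormalize the chosen experts' probabilities so they sum to 1.
    norm = sum(probs[i] for i in chosen)

    output = [0.0] * len(x)
    for i in chosen:
        expert_out = experts[i](x)
        weight = probs[i] / norm
        output = [o + weight * v for o, v in zip(output, expert_out)]
    return output, chosen


# Toy usage: four experts, but only two run for this input.
experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]
gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
output, chosen = moe_forward([1.0, 2.0], gate_weights, experts, top_k=2)
print("experts used:", chosen)
print("output:", output)
```

In this sketch, half of the experts sit idle for any given input, which is where the compute savings come from. Production models use many more experts and learned routers, but the routing principle is the same.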
All that happens without access to the latest hardware, yet DeepSeek is still seeing big efficiency gains over rivals. The report cites analyst estimates that DeepSeek’s prices might be 20 to 40 times lower than prices for ChatGPT tools. This puts pressure on OpenAI and Google to cut prices for ChatGPT and Gemini, respectively.
DeepSeek will continue to be unable to purchase the same Nvidia chips that Western AI firms have access to. The US may get even tougher on chip bans in the future, and I wouldn’t be surprised if the DeepSeek R2 upgrade further fuels future ban decisions.
I’m speculating here, but it’s based on what Reuters reports on the DeepSeek company culture. It looks like the people working there actually like being involved with the project, and it’s all because of how Liang Wenfeng manages the team.

Liang is a 40-year-old billionaire who first used AI at High-Flyer, a quantitative hedge fund. High-Flyer reinvested 70% of its profits into AI research before ChatGPT became a viral product. A few years ago, the company purchased two AI supercomputing clusters, including Fire-Flyer II, which comprised 10,000 Nvidia A100 chips.
The US banned the export of those chips to China in 2022, and DeepSeek used the ones already purchased to train its AI models. Rumors say the company may have smuggled in tens of thousands of additional chips since then, but that’s something that will probably never be confirmed.
Back to Liang: the DeepSeek founder set up the company as a research lab rather than a for-profit AI firm. He instituted a different management style, avoiding the “996” work culture common at other Chinese tech firms. The term refers to working 9 AM to 9 PM, six days a week. DeepSeek researchers, meanwhile, work 8-hour days.
It’ll be interesting to hear whether Liang kept his management style unchanged while pushing the DeepSeek R2 development, especially considering the report’s claim that the company wants to have the R2 model out sooner than planned. I can’t help but wonder whether working 8-hour days is enough for that.
The report further reveals that Liang recruited young engineers fresh from school, working side-by-side with them and allowing them to take ownership of DeepSeek research projects. These engineers are also very well paid. Senior High-Flyer data scientists might make about 1.5 million yuan annually, or around $206,000. That’s about double what competitors pay.
None of that is to say that OpenAI engineers don’t enjoy their work or aren’t paid handsomely. But we’ve heard of dozens of high-ranking OpenAI execs and co-founders who left the firm to start their own AI ventures. Then again, we shouldn’t expect the same level of transparency from Chinese companies. The Reuters report might paint a rosier picture than reality warrants.
However, the report also notes that DeepSeek has quickly become a success story in China, one that Beijing fully embraces. It’s not just DeepSeek engineers who might love the firm. The government might have investigated High-Flyer’s big AI chip purchases a few years ago, including that 10,000-chip cluster, but DeepSeek is now immensely popular, and its AI is being integrated across both the public and private sectors.
Some 13 major city governments and 10 state-owned energy companies now use DeepSeek AI. Tech giants such as Baidu, Lenovo, and Tencent have also begun adopting it.
While Reuters’ story can’t be confirmed, it sure looks like DeepSeek is growing in popularity with Chinese companies and the government, and that sort of support can further improve the firm’s ability to compete against OpenAI, Google, and other big AI firms.
Meanwhile, the Western world is ready to implement DeepSeek bans. That’s not surprising. DeepSeek might have gone viral, and Reuters paints a great picture of the company’s inner workings, but the AI still has issues that Western markets can’t tolerate.
Countries like Italy and South Korea have already announced bans on DeepSeek AI. The US government is also mulling a wider ban. The bans stem from user data privacy concerns, as DeepSeek user data goes to China. DeepSeek also has other issues, including widespread censorship of China-related topics and general AI safety concerns.
With all that in mind, it’s clear the DeepSeek R2 release coming by May can’t shock the markets like its predecessor did. But it’ll certainly be interesting to see how R2 competes against ChatGPT, Gemini, and others come spring.