In January this year, an announcement from China rocked the world of artificial intelligence. The firm DeepSeek released its powerful but cheap R1 model out of the blue — instantly demonstrating that the United States was not as far ahead in AI as many experts had thought.
Behind the bombshell announcement is Liang Wenfeng, a 40-year-old former financial analyst who is thought to have made millions of dollars applying AI algorithms to the stock market before using the cash in 2023 to establish DeepSeek, based in Hangzhou. Liang avoids the limelight and has given only a handful of interviews to the Chinese press (he declined a request to speak to Nature).
Liang’s models are as open as he is secretive. R1 is a ‘reasoning’ large language model (LLM) that excels at solving complex tasks, such as mathematics and coding problems, by breaking them down into steps. It was the first model of its kind to be released as open weight, meaning that it can be downloaded and built on for free, which has made it a boon for researchers who want to adapt algorithms to their own fields. DeepSeek’s success seems to have prompted other companies in China and the United States to follow suit by releasing open models of their own.
Although many of R1’s capabilities are on a par with those of the best US models, including the ones powering ChatGPT, it cost much less to train than its rivals, AI experts say. Training Meta’s Llama 3 405B model, for example, cost more than ten times as much. DeepSeek’s bid for transparency extended to publishing the details of how it built and trained R1 when, in September, the model became the first major LLM to undergo the scrutiny of peer review (D. Guo et al. Nature 645, 633–638; 2025). By releasing its recipe, DeepSeek taught other AI researchers how to train a reasoning model.
In many ways, “DeepSeek has been hugely influential”, says Adina Yakefu, a researcher at the community AI platform Hugging Face, which is based in New York City.
The heights of AI are a far cry from the village in Guangdong province where Liang was raised as the child of two primary-school teachers. Higher education took him to the prestigious Zhejiang University in Hangzhou, where he graduated with a master’s in engineering in 2010; his thesis involved crafting algorithms to track objects in videos. He soon applied his love of AI to financial markets and, in 2015, co-founded the hedge fund High-Flyer, spinning off DeepSeek in 2023.
At that time, China faced a hurdle in developing LLMs. US export controls prevented Chinese firms from buying certain powerful computer chips known as graphics processing units (GPUs) made by the US chip manufacturer NVIDIA, which are suitable for training LLMs. But Liang was already well provisioned. He had spent the previous decade purchasing 10,000 NVIDIA GPUs, fuelled by curiosity about what research could be done on them. In a 2023 interview with Chinese media company 36Kr, he likened their purchase to someone buying a piano for their home: “One can afford it, and there’s a group eager to play music on it.”
Like many Western AI entrepreneurs, Liang has set his sights on achieving artificial general intelligence — AI systems as adept as humans in cognitive tasks — and he has shaped his company around this, says Benjamin Liu, a former researcher at DeepSeek. The company prioritizes a person’s potential over their level of experience when hiring (one author on the DeepSeek R1 paper is still in secondary school) and it operates with little hierarchy, with researchers deciding what to work on themselves. Liang is said to be closely involved in research, and “even interns like myself were treated as full-time employees with meaningful responsibilities”, says Liu.