Thursday, January 30, 2025

How China created AI model DeepSeek and shocked the world

The DeepSeek app logo displayed on a mobile phone.

The DeepSeek-R1 large language model can perform some tasks at a level that rivals models made by OpenAI, the developer of the chatbot ChatGPT. Credit: Nicolas Tucat/AFP via Getty

Chinese technology start-up DeepSeek has taken the tech world by storm with the release of two large language models (LLMs) that rival the performance of the dominant tools developed by US tech giants, but were built with a fraction of the cost and computing power.

On 20 January, the Hangzhou-based company released DeepSeek-R1, a partly open-source ‘reasoning’ model that can solve some scientific problems at a similar standard to o1, OpenAI’s most advanced LLM, which the company based in San Francisco, California, unveiled late last year. And earlier this week, DeepSeek launched another model called Janus-Pro-7B, which can generate images from text prompts much like OpenAI’s DALL-E 3 and Stable Diffusion, made by Stability AI in London.

If DeepSeek-R1’s performance surprised many people outside of China, researchers inside the country say the start-up’s success is to be expected and fits with the government’s ambition to be a global leader in artificial intelligence (AI).

It was inevitable that a company such as DeepSeek would emerge in China, given the huge venture-capital investment in firms developing LLMs and the many people who hold doctorates in science, technology, engineering or mathematics fields, including AI, says Yunji Chen, a computer scientist working on AI chips at the Institute of Computing Technology of the Chinese Academy of Sciences in Beijing. “If there was no DeepSeek, there would be some other Chinese LLM that could do great things.”

In fact, there are. On 29 January, tech behemoth Alibaba released its most advanced LLM so far, Qwen2.5-Max, which the company says outperforms DeepSeek’s V3, another LLM the firm released in December. And last week, Moonshot AI and ByteDance released new reasoning models, Kimi 1.5 and Doubao 1.5-pro, respectively, which the companies claim can outperform o1 on some benchmark tests.

Government priority

In 2017, the Chinese government announced its intention for the country to become the world leader in AI by 2030. It tasked the industry with completing major AI breakthroughs “such that technologies and applications achieve a world-leading level” by 2025.

Developing a pipeline of ‘AI talent’ became a priority. By 2022, the Chinese ministry of education had approved 440 universities to offer undergraduate degrees specializing in AI, according to a report from the Center for Security and Emerging Technology (CSET) at Georgetown University in Washington DC. In that year, China supplied almost half of the world’s leading AI researchers, while the United States accounted for just 18%, according to the think tank MacroPolo in Chicago, Illinois.

DeepSeek probably benefited from the government’s investment in AI education and talent development, which includes numerous scholarships, research grants and partnerships between academia and industry, says Marina Zhang, a science-policy researcher at the University of Technology Sydney in Australia who focuses on innovation in China. For instance, she adds, state-backed initiatives such as the National Engineering Laboratory for Deep Learning Technology and Application, which is led by tech company Baidu in Beijing, have trained thousands of AI specialists.

Exact figures on DeepSeek’s workforce are hard to find, but company founder Liang Wenfeng told Chinese media that the company has recruited graduates and doctoral students from top-ranking Chinese universities. Some members of the company’s leadership team are younger than 35 years old and have grown up witnessing China’s rise as a tech superpower, says Zhang. “They are deeply motivated by a drive for self-reliance in innovation.”

Liang, at 39, is himself a young entrepreneur and graduated in computer science from Zhejiang University, a top institution in Hangzhou. He co-founded the hedge fund High-Flyer almost a decade ago and established DeepSeek in 2023.

Jacob Feldgoise, who studies AI talent in China at the CSET, says national policies that promote a model development ecosystem for AI will have helped companies such as DeepSeek, in terms of attracting both funding and talent.

But despite the rise in AI courses at universities, Feldgoise says it is not clear how many students are graduating with dedicated AI degrees and whether they are being taught the skills that companies need. Chinese AI companies have complained in recent years that “graduates from these programmes were not up to the quality they were hoping for”, he says, leading some firms to partner with universities.

‘Efficiency under constraints’
