Thursday, February 5, 2026

Open-source AI tool beats giant LLMs in literature reviews — and gets citations right


OpenScholar is an LLM that performs scientific literature reviews using a database of 45 million open-access articles. Credit: dpa via Alamy

Researchers have published the recipe for an artificial-intelligence model that reviews the scientific literature better than some major large language models (LLMs) can, and gets its citations correct as often as human experts do.

OpenScholar — which combines a language model with a database of 45 million open-access articles — links the information it sources directly back to the literature, to stop the system from making up, or ‘hallucinating’, citations.
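The core idea — retrieve real documents first, then make the model's output cite only what was retrieved — can be illustrated with a toy sketch. This is not OpenScholar's actual pipeline: the corpus, keyword-overlap scoring and function names below are illustrative stand-ins for its real dense retriever and 45-million-article index.

```python
# Toy illustration of retrieval-grounded citation: answers may only cite
# documents that the retriever actually returned, so citation IDs always
# point at real entries rather than being generated freely by the model.

# Hypothetical mini-corpus standing in for the open-access article database.
corpus = {
    "smith2021": "transformer models improve citation accuracy in reviews",
    "lee2023": "open access databases support literature retrieval at scale",
}

def retrieve(query, corpus, k=1):
    """Rank articles by word overlap with the query — a crude stand-in
    for a trained dense retriever."""
    query_words = set(query.lower().split())
    scored = sorted(
        corpus.items(),
        key=lambda item: len(query_words & set(item[1].split())),
        reverse=True,
    )
    return scored[:k]

def answer_with_citations(query, corpus):
    """Attach the retrieved document's ID to each returned passage, so
    every claim in the output is traceable to a source in the database."""
    hits = retrieve(query, corpus)
    return [f"{text} [{doc_id}]" for doc_id, text in hits]
```

In a real system the retrieved passages would be fed to the language model as context, with the model constrained to cite only the supplied document IDs; the toy version simply returns the passages with their IDs attached.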

Several commercial AI-based literature-review tools already exist that use similar techniques, but few have been released as open source, says Akari Asai, an AI researcher at Carnegie Mellon University in Pittsburgh, Pennsylvania, and a co-author of the work, published in Nature on 4 February1. Being open source means that researchers can not only try OpenScholar for free in an online demonstration, but also deploy it on their own machine and use the method in the paper to boost the literature-review skills of any LLM, says Asai.

In the 14 months since OpenScholar was first published in the arXiv repository2, AI firms such as OpenAI have used similar methods to tack ‘deep research’ tools onto their commercial LLMs, which has greatly improved their accuracy. But as a small and efficient system, running OpenScholar costs a fraction of the price of using OpenAI’s GPT-5 with deep research, co-author Hannaneh Hajishirzi, a computer scientist at the University of Washington in Seattle, tells the Nature podcast.

However, the authors acknowledge that OpenScholar has limitations. For example, it doesn’t always retrieve the most representative or relevant papers for a query, and it is limited by the scope of its database.

But if researchers are able to access the tool for free, “it can become one of the most popular apps for scientific searches,” says Mushtaq Bilal, a researcher at Silvi, a Copenhagen-based firm that has its own AI-based literature-review tool.

Outperforming humans?

LLMs can write fluently, but they often struggle with citations. This is because they learn by building links between words in their training data, which include sources outside science, and then generate text on the basis of probable associations that are not always correct or up to date. This is a feature of LLMs, not a bug, and it is proving to be a problem when people use LLMs in research. For example, at least 51 papers accepted to NeurIPS, a high-profile machine-learning conference, in December 2025 contained non-existent or inaccurate citations, according to an analysis using the GPTZero tool.
