
An artificial intelligence (AI) tool that scans manuscript titles and abstracts has flagged more than 250,000 cancer studies that bear textual similarities to articles that are known to have been produced by paper mills. These businesses produce fake or low-quality research papers and sell authorships.
Articles produced by paper mills often include fabricated data, duplicated images and weird phrases: strange wording choices used to evade plagiarism detectors. Integrity specialists and sleuths can spot these flaws, but the process is time-consuming and, in many cases, the involvement of paper mills cannot be proved, so quantifying the scale of the problem is difficult.
However, paper mills probably rely on boilerplate templates to mass-produce papers, says Adrian Barnett, a statistician at Queensland University of Technology in Brisbane, Australia, and such templates could be detected by large language models (LLMs) that analyse patterns in text. Barnett and his colleagues developed a model and posted their analysis (ref. 1) on the preprint server bioRxiv last month. It has not yet been peer reviewed. The authors emphasize that the flagged papers should be checked by human specialists and are not confirmed cases of research fraud.
Adam Day, the founder of research-integrity firm Clear Skies in London, says the preprint's estimates are similar to those produced by the Papermill Alarm, research-integrity screening software that his firm developed. But he cautions that the approach the preprint authors used could be flagging legitimate papers and needs further verification.
Suspected paper mill
Barnett and his colleagues trained a language model called BERT to distinguish between ‘genuine’ cancer studies and retracted papers that were listed as involving ‘suspected paper-mill activity’ in a public database maintained by research-integrity blog Retraction Watch. The BERT model scans titles and abstracts for certain words and phrases that it associates with paper-mill activity, a process similar to filtering spam e-mails.
Retraction notices rarely state when a study was created to order by a paper-mill company, but Retraction Watch has developed its own criteria — on the basis of its reporting and reviews of thousands of notices — to assign retracted papers as suspected paper-mill articles, says Ivan Oransky, a specialist in academic publishing and a co-founder of Retraction Watch.
After screening abstracts and titles, the AI tool gave each article a probability score indicating how closely it resembles retracted papers suspected of coming from paper mills. In a test of 276 retracted papers and 275 genuine papers that were not included in the training data, BERT was 91% accurate. The false-negative rate, the share of paper-mill articles that the tool failed to identify, was about 13% (37 out of 276). The false-positive rate, the share of genuine papers that it flagged incorrectly, was around 4% (12 out of 275).
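The reported rates follow directly from the test counts. The short Python sketch below reproduces the arithmetic using only the figures quoted above; the function and variable names are illustrative and are not taken from the preprint.

```python
def confusion_rates(n_mill, n_genuine, missed_mill, flagged_genuine):
    """Return accuracy, false-negative rate and false-positive rate.

    n_mill          -- retracted (suspected paper-mill) papers in the test set
    n_genuine       -- genuine papers in the test set
    missed_mill     -- paper-mill papers the tool failed to flag
    flagged_genuine -- genuine papers the tool flagged incorrectly
    """
    true_pos = n_mill - missed_mill
    true_neg = n_genuine - flagged_genuine
    total = n_mill + n_genuine
    return {
        "accuracy": (true_pos + true_neg) / total,
        "false_negative_rate": missed_mill / n_mill,
        "false_positive_rate": flagged_genuine / n_genuine,
    }


# Counts as reported for the held-out test set.
rates = confusion_rates(n_mill=276, n_genuine=275, missed_mill=37, flagged_genuine=12)
for name, value in rates.items():
    print(f"{name}: {value:.1%}")
# accuracy: 91.1%
# false_negative_rate: 13.4%
# false_positive_rate: 4.4%
```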
The AI tool was then used to screen 2.6 million cancer-research papers — identified from the PubMed database of biomedical literature — that were published in 11,632 journals between 1999 and 2024. The tool identified 261,245 of the papers as suspected paper-mill articles, most of which were fundamental research studies.
The analysis also suggests that paper-mill activity has risen steeply over the past two decades. Only 1% of cancer papers published in the early 2000s were flagged by the AI tool as probably produced by a paper mill, but this share grew to more than 15% in the early 2020s, peaking at 16.6% in 2022 before declining in 2023 and 2024.
But Day says the results probably include many legitimate papers. Having equal numbers of genuine and problematic papers in the training data does not accurately represent the research literature, in which fraudulent papers are much rarer. This mismatch, he says, could inflate the false-positive rate when the tool is applied to real-world data.
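The base-rate effect behind this concern can be illustrated with a short calculation. The sketch below assumes, purely for illustration, that the error rates measured on the balanced test set (37 of 276 missed, 12 of 275 wrongly flagged) carry over unchanged to the wider literature. The prevalence values are hypothetical and are not estimates from the preprint or from Day.

```python
def flagged_precision(prevalence, sensitivity, false_positive_rate):
    """Share of flagged papers that really are paper-mill articles (Bayes' rule)."""
    true_flags = prevalence * sensitivity
    false_flags = (1 - prevalence) * false_positive_rate
    return true_flags / (true_flags + false_flags)


# Error rates from the balanced test set; assumed here to hold in the wild.
SENSITIVITY = (276 - 37) / 276
FALSE_POSITIVE_RATE = 12 / 275

# Hypothetical shares of paper-mill articles in the literature.
for prevalence in (0.5, 0.1, 0.01):
    precision = flagged_precision(prevalence, SENSITIVITY, FALSE_POSITIVE_RATE)
    print(f"prevalence {prevalence:.0%}: "
          f"{precision:.0%} of flagged papers are paper-mill articles")
```

Under these assumptions, about 95% of flagged papers would be true paper-mill articles if half of all papers came from paper mills, but only about 17% if the true share were 1%. The rarer the problem, the larger the fraction of flags that land on legitimate papers.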
The team found “no evidence” that the proportion of predicted paper-mill articles was inflated in their analysis, says co-author Baptiste Scancar, a data scientist at the French Institute for Higher Education and Research in Food, Agriculture and the Environment in Rennes, France. “The true proportion of paper mill articles in cancer research is unknown and likely very high,” he adds. “We believe the figures reported in the manuscript are underestimates.”