
Artificial-intelligence translators could help researchers to meet the arXiv preprint server’s new mandate that all manuscripts be submitted in English. Credit: Sharaf Maksumov/Alamy
Every month, more than 20,000 scientific manuscripts by authors from around the world are posted on the preprint repository arXiv, the oldest and best-known preprint site. Now researchers uploading their work to the site are facing a new requirement: from 11 February, all submissions must be either written in English or accompanied by a full English translation.
Until now, authors have had to submit only an abstract in English. Staff at arXiv say that the English rule will make life easier for its moderators and keep its readership broad. “We can’t be fair in judging papers if they are not in English,” says Ralph Wijers, the chair of the arXiv editorial advisory council and an astronomer at the University of Amsterdam, whose native language is Dutch. The site, based at Cornell University in Ithaca, New York, does not undertake peer review, but a team of some 300 volunteer moderators verifies that submissions are “appropriate and topical”.
ArXiv hosts nearly 3 million preprints across eight subject areas, although the vast majority of the manuscripts are in computer science, physics and mathematics. Just 1% of submissions are in a language other than English. Nonetheless, the revised language policy has prompted some vocal complaints, including arguments that the burden of the mandate might deter people from making content such as PhD theses and preprints of textbook chapters public. Authors of such texts might think it is not worth the effort to translate them or to find an alternative venue for making them accessible.
“I personally see it as a loss for our community,” says mathematician Angelo Lucia at the Polytechnic of Milan in Italy.
Several French mathematicians commented on the arXiv announcement, saying that they might instead take their manuscripts to the French preprint server HAL (Hyper Articles en Ligne). HAL hosts works in several languages including English, French and Spanish, without requiring translations.
Machine translation
The arXiv policy specifies that automated translations, such as those done by artificial-intelligence chatbots, are acceptable, so long as they are faithful to the original work.
Editors at arXiv have some reservations about these systems’ capabilities, however. “Our advice is: feel free to use an AI or an LLM [large language model] to translate your text, but please check it,” says Wijers. “Our own experience is that AI translation is good but not good enough.”
This caution echoes that expressed by respondents to a Nature survey in 2025 of more than 5,000 researchers from around the world (respondents included both volunteers and randomly selected authors of recent papers). Although more than 90% of those surveyed felt it was acceptable to use AI to translate a paper into another language (and 8% had done so), more than half said this would be appropriate only if the translation was checked by a native speaker.
Delving deep
LLMs are widely considered to be excellent at generating conversational text, but limited attention has been given to their prowess at translating scientific papers.
James Zou, a computer scientist at Stanford University in California, and Hannah Kleidermacher, a doctoral student in electrical engineering also at Stanford, investigated one LLM’s ability to translate academic text from English into other languages. They asked GPT-4o — an LLM released by OpenAI in San Francisco, California, in 2024 — to create a 50-question multiple-choice quiz for each of six scientific papers in English across various topics, with an answer key. This produced an automated benchmark with which to evaluate the LLM’s performance. The authors then instructed the LLM to translate the six papers into 28 other languages, and take the quiz on the translated versions.
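The quiz-based check described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not the researchers' actual code: the function names, the `translate` and `take_quiz` callables (stand-ins for calls to an LLM) and the exact-match scoring rule are all assumptions made for the example.

```python
from typing import Callable, Dict, List


def quiz_accuracy(answer_key: List[str], answers: List[str]) -> float:
    """Fraction of multiple-choice answers that match the answer key."""
    if not answer_key or len(answer_key) != len(answers):
        raise ValueError("answer key and answers must be non-empty and equal in length")
    correct = sum(key == given for key, given in zip(answer_key, answers))
    return correct / len(answer_key)


def evaluate_translations(
    paper: str,
    questions: List[str],
    answer_key: List[str],
    languages: List[str],
    translate: Callable[[str, str], str],            # hypothetical: (text, language) -> translated text
    take_quiz: Callable[[str, List[str]], List[str]],  # hypothetical: (text, questions) -> chosen options
) -> Dict[str, float]:
    """Translate one paper into each language and score the quiz on each translation."""
    scores: Dict[str, float] = {}
    for language in languages:
        translated = translate(paper, language)
        answers = take_quiz(translated, questions)
        scores[language] = quiz_accuracy(answer_key, answers)
    return scores
```

In a setup like this, a lower quiz score for a given language would suggest that information was lost or distorted in translation, although it cannot separate translation errors from the model's own difficulty in reading that language.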


