Millions of research articles are absent from major digital archives. This worrying finding, which Nature reported on earlier this year, was laid bare in a study by Martin Eve, who studies technology and publishing at Birkbeck, University of London. Eve sampled more than seven million articles with unique digital object identifiers (DOIs), which are strings of characters used to identify and link to specific publications, such as scholarly articles and official reports. He found that more than two million of these were ‘missing’; that is, they were not preserved in the major archives that ensure literature can be found in the future (M. P. Eve J. Libr. Sch. Commun. 12, eP16288; 2024).
Eve, who is also a research developer at Crossref, an organization that registers DOIs, carried out the study in an effort to better understand a problem librarians and archivists already knew about — that although researchers are generating knowledge at an unprecedented rate, it is not necessarily being stored safely for the future. One contributing factor is that not all journals or scholarly societies survive in perpetuity. For example, a 2021 study found that a lack of comprehensive and open archiving meant that 174 open-access journals, covering all major research topics and geographical regions, vanished from the web in the first two decades of this millennium (M. Laakso et al. J. Assoc. Inf. Sci. Technol. 72, 1099–1112; 2021).
A lack of long-term archiving particularly affects institutions in low- and middle-income countries, less-affluent institutions in rich countries and smaller, under-resourced journals worldwide. Yet it’s not clear whether researchers, institutions and governments have fully taken the problem on board. “Preservation is an issue and it’s an issue that everyone flags, but it’s not an easy issue to solve,” says Iryna Kuchma, the open-access programme manager at Electronic Information for Libraries, a non-profit organization in Vilnius that aims to improve people’s access to digital information.
“More and more journals are being established with less and less checks and balances,” says Ginny Hendricks, chief programme officer at Crossref, who is based in London. “You’ve got the big publishers, who are doing a decent job, but then there’s half the journals in the world that are run on a shoestring, and it costs them money to have some kind of service from preservation networks, if they even know about them.”
For this Editorial, Nature asked librarians, archivists, scholars and international organizations for suggestions on how to improve the situation. Researchers, institutions and funders should take note of what they can do to help.
At the heart of the problem is a lack of money, infrastructure and expertise to archive digital resources. “Digital preservation is expensive and also quite difficult,” says Kathleen Shearer, who is based in Montreal, Canada, and is the executive director of the Confederation of Open Access Repositories, a global network of scholarly archives. “It is not just about creating backup copies of things. It is about the active management of content over time in a rapidly evolving technological environment.”
For institutions that can afford it, one solution is to pay a preservation archive to safeguard content. Examples include Portico, based in New York City, and CLOCKSS, based in Stanford, California, both of which count a raft of publishers and libraries as customers.
But archiving is often not prioritized when money is tight, as it generally is for publishers in low-resource settings. “That is more of a challenge because a lot of these journals are small and they’re more at risk because they don’t have their own robust infrastructures for platforms and preservation services themselves,” says Kate Wittenberg, Portico’s managing director.
Another option could be for institutions and funding bodies to include text and data archiving as a requirement in research projects, along with publishing papers. At a minimum, this would mean depositing work in institutional repositories, in cases in which such facilities exist. When they don’t, making archiving mandatory would compel researchers and their funding bodies to think hard and find solutions to meet an archiving requirement.
Making archiving obligatory would also encourage universities that don’t yet operate their own repositories to work towards instituting them. “Universities are one of the most enduring elements of our society,” says Hussein Suleman, a digital libraries scholar at the University of Cape Town in South Africa. “If we adopted this widely, this would be a safeguarding mechanism for the knowledge of our current generation so the future generations can access it.”
A further option is for more countries to implement ‘legal deposit libraries’ — keystone libraries into which authors or publishers are obliged to deposit new work. The concept was originally devised so that at least one institution always had a publicly available copy of every published book, but in some countries it has since been expanded to include research works. Further expanding it would not offer a complete solution, because material archived in legal deposit is not easy to find — but it could be done as an absolute minimum to ensure that copies of scholarship continue to exist if their originators are no longer able to support archiving. More or better coordination “between the big players globally” is also needed, says Hendricks. And global should not only mean Western, she adds.
Increasing people’s access to knowledge and increasing the visibility of new research is rightly a focus for global research-publishing policy. Archiving is core to this — and core to scholarship itself. As Eve told Nature in March: “Our entire epistemology of science and research relies on the chain of footnotes.” If access to this knowledge becomes more restricted, the research that survives will be dominated by institutions, such as those in Europe and the United States, that have the funds to safeguard their research in archives. Action must be taken now to ensure that records of the scholarship undertaken by everyone, everywhere, can exist in perpetuity.