Why science has a credibility problem — and how to address it

Open-science advocate Brian Nosek is involved in an effort to assess whether the replicability of social-science results can be predicted. Credit: Center for Open Science

As a graduate student in psychology, Brian Nosek co-founded Project Implicit, a non-profit organization that uses fast-response online tests to look for hints of gender, racial and other biases in the unconscious associations people make between words. The Implicit Association Tests hosted by this project have been taken tens of millions of times since they were introduced in 1998, and are widely referenced across social-psychology research, pop culture and corporate training.

Nosek, now at the University of Virginia in Charlottesville, has since become a pioneer in the field of open science, where he has applied his psychological expertise to examine the gap between the values of science and its practice. In 2013, he co-founded the non-profit Center for Open Science (COS) in Charlottesville to address this gap in a practical way.

His work has not been without controversy. In 2024, he co-wrote a paper that tested a method for improving replicability (J. Protzko et al. Nature Hum. Behav. 8, 311–319; 2024). It was retracted after critics pointed out false claims about the preregistration of the studies in the paper — a situation that Nosek described at the time as a “screw-up”.

This week, Nosek and colleagues are releasing the results of a project called Systematizing Confidence in Open Research and Evidence (SCORE). The effort examined whether social-science papers hold up to reanalysis, either when using the same data or in fresh studies, and whether replicability can be predicted. Nature talked to Nosek about his work and what can be done to improve the practice of research.

What is the main gap that you see between scientific values and practice?

All researchers would say that, of course, transparency is important for science. I need to be able to interrogate your findings; you need to be able to interrogate mine. But the way in which science is structured doesn’t reward transparency. Publication is the currency of advancement, and decisions governing publication are largely about the outcomes that researchers produce, not about rigour, transparency or reproducibility.

Bias is ever-present — that’s what we have learnt from Project Implicit. The purpose of transparency is to help expose when bias occurs, and to give occasion to correct it.

How does the Center for Open Science attempt to close that gap?

Its mission is to increase the openness, integrity and trustworthiness of research through research, policy advocacy and infrastructure, along with community building, training and education. It maintains the Open Science Framework (OSF), a project-management tool that now has around one million registered users. This allows researchers to be open about all stages of their work, from the design to the data, code, preprints and final publications. The centre also published the Transparency and Openness Promotion Guidelines in 2015, which were updated last year. These aim to increase the verifiability of research claims, and thousands of journals have adopted the policies in one form or another.

Which projects to improve open science have worked well?

My favourite is Registered Reports, a publishing model in which the authors submit their plan for an experiment before they conduct it. In this approach, the peer-review process evaluates whether the plan addresses an important question, and whether the methodology is an effective way to test that question. If it passes peer review, then the journal makes an in-principle commitment to the authors to publish the work, regardless of the outcome. I love this model because it doesn’t give up publication as the currency of advancement, but changes what it takes to get a publication. It’s no longer just about producing outcomes that are novel and positive and exciting. It’s about asking important questions and designing rigorous tests of those questions. More than 300 journals have adopted this approach, including Nature for papers on social and behavioural research and neuroscience.

Have there been reforms that you thought were great ideas, but which failed?

All the time. So many things go wrong, just as they do in science.

We recognized early on that an unstable Internet connection was one of the barriers to open science for low-income countries. And so we said, let’s build an open-source file-sharing solution so researchers can upload their files when the connection is working. And we just couldn’t get it done.

Also, we designed the OSF very flexibly at the outset so that people could be experimental. But as open science has moved into the mainstream, that flexibility has become a liability. Most people want more structured guidance. Now we need to update our infrastructure to catch up with the movement, which has outpaced us.

If you could change the science publication system in one way, what would it be?

Transparency from start to finish, including in the planning. Publication occurs much too late in the life cycle of science. One of the latest COS experiments is called Lifecycle Journal. The idea is to have the evaluation process occurring continuously alongside the research as it happens. We’re using the life-cycle framework for a pilot project with the technology company Meta to make data from the social-media app Instagram available to researchers studying young people’s well-being. That’s a very hot topic.

Science is currently structured to reward publication rather than transparency. Credit: xu wu/Getty

The SCORE results are disturbing: only about half of the papers were successfully replicated or reproduced. What’s your view?

If you take the results as the end of the road, the numbers are terrifying. But they're not the end of the road; they're the start of a conversation.

To assess reproducibility, we took the same data as in the original paper and tried to apply the same analysis to see whether we got the same result. Only 53% of the 145 papers we tested reproduced precisely. But we have suggestive evidence that the papers didn't report results wrongly; they just didn't report enough. If we hadn't had to reconstruct the data and work out the analysis from a description in the text, if the authors had actually shared the code, then the results would have reproduced beautifully.

To examine replicability, we subjected 164 papers to replication attempts using new data to see whether the same result came out: 49% replicated successfully according to the main criterion — statistical significance with the same pattern as in the original study.
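The main criterion Nosek describes, statistical significance with the same pattern of results as in the original study, can be illustrated with a short sketch. The code below is a hypothetical reconstruction for a simple two-group comparison using a conventional 0.05 threshold; it is not SCORE's actual analysis pipeline, and the function and variable names are invented for illustration.

```python
# Illustrative sketch of a replication-success check of the kind described above:
# the replication counts as a success if its effect is statistically significant
# (p < 0.05) AND its direction matches the original finding.
# This is an assumed reconstruction, not SCORE's actual code.

from scipy import stats

def replication_successful(original_effect, replication_group_a, replication_group_b,
                           alpha=0.05):
    """Return True if the replication reaches significance with the same sign as the original."""
    # Two-sample t-test on the new data collected for the replication attempt.
    t_stat, p_value = stats.ttest_ind(replication_group_a, replication_group_b)
    same_direction = (t_stat > 0) == (original_effect > 0)
    return p_value < alpha and same_direction

# Hypothetical example: the original study reported a positive effect,
# and the replication re-runs the comparison on freshly collected samples.
group_a = [5.1, 4.8, 5.6, 5.9, 5.3, 5.7, 6.0, 5.2]   # invented treatment scores
group_b = [4.2, 4.5, 4.1, 4.7, 4.4, 4.0, 4.6, 4.3]   # invented control scores
print(replication_successful(original_effect=0.8,
                             replication_group_a=group_a,
                             replication_group_b=group_b))
```

Under a criterion of this kind, a replication that yields a significant effect in the opposite direction from the original would still count as a failure.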

But we don’t know what the optimal level of replicability is. In very active areas of research, 100% replicability might imply that the work is extremely conservative and does not push the boundaries of knowledge into the unknown. The important part is not the number — it’s that scientists say ‘wow, I had presumed that published findings are more repeatable than they are’. There is a hidden uncertainty in papers that is not recognized in the way it ought to be.

When you read a paper, how do you evaluate whether it’s likely to be credible?

There are a few things I look for immediately. If it’s a tiny sample size for the question, I say, ‘I’m not going to spend my time on this’. I tend to jump to the methodology: I want to understand what the researchers were doing, and how they made those decisions.

A core lesson from SCORE is that there is no single measure of trustworthiness, and there never will be. ‘Published or not’, for example, is a crude way to assess the quality of science. We take peer review as sacrosanct when everyone knows it’s not. Peer review is highly tentative, occurs at a single point in time, is ad hoc, permanent — and, in most cases, opaque.

Do you imagine a world in which there is a set of simple credibility indicators for research?

There is a lot of work on indicators of credibility. The research-evaluation platform scite.ai aims to determine how many times a paper’s findings have been supported or contradicted. A project called ERROR [Estimating the Reliability and Robustness of Research] is providing payments to people who identify errors in published papers. And other groups are using AI to compare whether research plans are consistent with the reported outcomes.
