In a crisis, panic: that’s pretty much a working definition of a crisis. It’s fight or flight; not many will acquiesce to the platitudes on their coffee cups and “Keep Calm and …whatever”. But look a little closer and you’ll see further sub-types of responder. There’s the Harry Enfield, Mr “You Don’t Want to Do That”. An object lesson in hindsight, these folk will explain how they knew all along that there was trouble brewing (although they were strangely quiet until the bandwagon got rolling). And then there are the white knights, riding in to solve the problem. These latter-day Bruce Willises appear from nowhere, but boy do they know how to fix it! If nothing else, the fact that these archetypes are stalking the field at the moment strongly suggests there’s a crisis in modern psychology.
It’s hard not to pick up on the existential panic around reproducibility, exemplified by Nosek and colleagues’ 2015 paper. There can’t be any doubt that, as issues come (and go), it’s fundamental to psychological science. So well done to @healthpsycleeds and @BPSOfficial for hosting the recent excellent debate at the Royal Society on Replication and Reproducibility in Psychological Science.
We’ll come to that debate, but let’s start with a closer look at these crisis sub-types. Typically, the “Harrys” propose greater restriction on what we can count as the acceptable use of inferential statistics. The system isn’t broken, but it needs to be tightened up. Their solution to a crisis in statistics: more statistics! My instinct is that increased restriction rarely fosters opportunity. With some prominent hind-sighters one can almost hear every rung of the ladder clicking into place as it disappears.
Then there are the white knights, the Bruces. From what I can see, the Bruces, like the Harrys, propose greater restriction in the name of “open science”. (I object to the term ‘open science’ in this context. No one, least of all me, is going to object to greater openness. But the dominant formulation of the drive for more ‘open science’ suggests something is closed. If there’s widespread misuse of statistics in psychology I don’t believe that’s happening because we’re not open. There may be ignorance, but there’s no collective cover-up here.) In this vein, there’s increasing traction around the idea of using registered reports in journals (perhaps tying in with funding bodies) to address reproducibility concerns. The idea is that methods and proposed analyses are submitted to a journal prior to the research being conducted, on the understanding that the research will, ultimately, be published regardless of the research findings. Several journals have adopted the procedure.
Has this really been thought through? I don’t believe it has. The registered reports approach closes down the opportunities for a counter-intuitive result because, like it or not, exploratory analyses will never have the status of “proper”, pre-registered results and thus meaningful innovation is stifled. From a practical perspective, the worry is that the registered reports approach will either create huge additional costs to conducting science or will lead to a proliferation of speculative submissions, probably from larger, established labs. Registered reports will create demand that far outstrips the already limited supply of potential manuscript reviewers.
The approach may work in some fields, but my chief concern, as a psychologist, is that registered reports will inevitably inhibit diversity in the field. There are two ways this will happen. First, few labs and fewer researchers will ever have the funds, time, or resources to undertake the sort of piloting, planning, and multiple testing suggested by the registered reports approach. As a journal editor, I cannot imagine that the approach would increase the submissions from developing nations or truly diverse researchers and groups. Second, what about other research methods? There’s nothing wrong with a good case study and, dare I say it, an intelligent qualitative analysis can generate ideas that no experiment can. It’s as sad as it is ironic that the casualties of the replication crisis are likely to be precisely those researchers who weren’t part of the problem in the first place! Psychology deserves a far more radical analysis of its evidence base than just flapping around over p-values.
Anyhow, it feels premature to be setting new, rigid statistical rules because there’s currently little consensus on whether we should refine our use of p-values, move to Bayesian techniques, or re-explore the use of effect sizes and confidence intervals. Sure, let’s carefully and calmly discuss what counts as evidence in our field. Let’s have a measured and inclusive debate. That’s the way any discipline develops. But don’t panic. And in the meantime, my advice? “Keep Calm and do an ANOVA”.