In a new paper in the Journal of Experimental & Theoretical Artificial Intelligence, Chris Drummond takes aim at the 'reproducibility movement' which has lately risen to prominence in science. As one of the early advocates for this movement, I was interested to see what Drummond had to say. While I don't find his argument wholly convincing, he does raise some good points.
Drummond begins by summarizing the case for reproducible research as he sees it. The claim is that reproducibility - the ability of other scientists to exactly reproduce and confirm a given result - is central to science. It is further claimed that we can promote reproducibility by requiring authors to submit their data, and their analysis scripts (code), with each publication, and that this will, amongst other benefits, help to prevent scientific fraud. Against this, Drummond says that
(1) Reproducibility, at least in the form proposed, is not now, nor has it ever been, an essential part of science. (2) The idea of a single well-defined scientific method resulting in an incremental, and cumulative, scientific process is, at the very best, moot. (3) Requiring the submission of data and code will encourage a level of distrust among researchers and promote the acceptance of papers based on narrow technical criteria. (4) Misconduct has always been part of science with surprisingly little consequence. The public's distrust is more likely to do with the apparent variability of scientific conclusions.
To my mind the most interesting part of this paper falls under Drummond's discussion of point (1), in which he argues that reproducibility is not very important to science, contrary to popular belief. Drummond defines 'reproducibility' as the ability to repeat an experiment as exactly as possible and get the same result: "The aim is to minimise the difference from the first experiment including its flaws, to produce independent verification of the result as reported." But, Drummond points out, scientists are generally not interested in experimental results for their own sake; rather, we use experimental results to test hypotheses. The best way to test a hypothesis is to carry out several different experiments, using different methodologies, to provide convergent evidence. Drummond says that what scientists are really interested in is the 'retestability' of a given hypothesis - not the reproducibility of a given piece of data. Now, I've previously said that reproducibility is fundamental to science:
In my view, replicability is the essence of scientific truth. To say that a certain scientific result is true or valid, is nothing other than to say that someone, who correctly carries out the same methods, would be able to confirm it for themselves. Without the assumption of replicability, scientific papers would become merely historical documents – ‘we did so and so, and we observed so and so, but your mileage may vary.’
However, I actually think that Drummond and I have common ground. Certainly, I agree with Drummond that convergent evidence from multiple different methods is the strongest kind of support for a hypothesis. This is because any given piece of evidence may be misleading, even if it is reproducible - it might be the result of a reproducible artifact. That said, I still think that reproducibility is fundamental. If we have multiple pieces of evidence for a hypothesis, but none of those pieces of evidence is reproducible, then the hypothesis has no support. Reproducibility of the primary evidence must come first, before we can marshal that evidence to support our models. A model supported by lots of unreproducible evidence is a house built on sand. So I agree with Drummond that reproducibility alone is not sufficient to make strong science (I'm not sure anyone thinks it is), but I stand by my view that it is necessary.