The July issue of the Newsletter mentioned a recent publication by Daniel Benjamin and his colleagues, who analyzed the ability of cancer researchers to judge whether selected preclinical reports would be reproduced. On average, researchers forecast a 75% probability of replicating the statistical significance and a 50% probability of replicating the effect size, yet none of the five studies with reported results replicated on either criterion.
This is not the first time that such a “prediction” exercise has been conducted. In an analysis of 44 studies published in prominent psychology journals and replicated within the Reproducibility Project: Psychology, scientists’ individual forecasts were also not very accurate. Anna Dreber and colleagues have suggested that specialized tools and methods such as prediction markets can be more accurate in assessing the reproducibility of published scientific results.
Analyses such as those described above are made possible by access to real data (i.e. predictions are compared against the actual results generated by the Reproducibility Project), and these comparisons show that the accuracy of predictions varies.
The field of meta-analysis provides several tools for estimating publication bias in favor of positive results. For example, Tsilidis and colleagues conducted a meta-analysis of 4,445 data sets in the CAMARADES (Collaborative Approach to Meta-analysis and Review of Animal Data in Experimental Studies) database. They found that the observed rate of nominally significant results was nearly twice the rate that would be expected if the plausible effect is estimated from the effect size of the most precise studies (i.e. those with the smallest standard errors). How accurate is this estimate? Does it suggest that studies with statistically non-significant results stay unpublished?
One could once again apply analytical tools such as trim-and-fill analysis to estimate how many studies have not been published, and assume that they remain unpublished because their results were negative.
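To make the observed-versus-expected comparison described above more concrete, here is a minimal Python sketch. It is not the procedure used by Tsilidis and colleagues, and it is not a trim-and-fill analysis. It only illustrates the general idea of taking the estimate from the most precise study as the plausible effect, computing each study's power to detect that effect, and comparing the sum of those powers with the number of nominally significant results. The function names, the two-sided normal approximation, and all numbers are hypothetical and chosen for illustration only.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def power_two_sided(true_effect, se, alpha=0.05):
    """Power of a two-sided z-test to detect `true_effect` given standard error `se`."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = abs(true_effect) / se
    # Probability that the test statistic falls in either rejection region.
    return (1 - _STD_NORMAL.cdf(z_crit - shift)) + _STD_NORMAL.cdf(-z_crit - shift)


def excess_significance(effects, std_errors, alpha=0.05):
    """Return (observed, expected, plausible_effect) for a set of studies.

    Assumption: the plausible effect is the estimate from the single most
    precise study, i.e. the one with the smallest standard error.
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    most_precise = min(range(len(std_errors)), key=std_errors.__getitem__)
    plausible = effects[most_precise]
    # Observed: number of studies that are nominally significant.
    observed = sum(abs(e / s) > z_crit for e, s in zip(effects, std_errors))
    # Expected: sum of each study's power to detect the plausible effect.
    expected = sum(power_two_sided(plausible, s, alpha) for s in std_errors)
    return observed, expected, plausible


# Hypothetical data, invented purely for illustration (not from any real study).
effects = [0.9, 0.8, 0.7, 0.6, 0.2]
std_errors = [0.40, 0.35, 0.30, 0.25, 0.10]

observed, expected, plausible = excess_significance(effects, std_errors)
print(f"plausible effect: {plausible}")
print(f"observed significant: {observed}, expected significant: {expected:.2f}")
```

When the observed count is much larger than the expected count, this is consistent with an excess of significant results, although the sketch by itself cannot say why the excess arises.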
Alternatively, one could try to identify sources of information about real unpublished data. Here, an interesting example was provided by Eugene O’Boyle and his colleagues, who analyzed dissertations in the field of management research and demonstrated that, from dissertation to journal article, the ratio of supported to unsupported hypotheses more than doubled. Management research may seem far removed from biomedical research. However, the 2016 Nature survey by Monya Baker showed that concerns about the reproducibility crisis are shared across scientific disciplines. Furthermore, the O’Boyle paper discusses exactly the same problems and their roots, in the same language that we use in the biomedical area when analyzing reproducibility issues.
Given that, in most countries, dissertations are published online, this presents an excellent opportunity to support arguments about the limited robustness of published data and about publication bias with hard empirical evidence.