The replication of initial results is a critical part of the scientific process. In preclinical biomedical research, it is common practice to conduct exact replications with the same sample sizes as those used in the initial experiments. In this article, the authors point out that such replication attempts have a lower probability of success than is generally appreciated. Indeed, in the common scenario of an effect just reaching the “statistical significance” level of p < 0.05, the statistical power of the replication experiment (assuming the same effect size) is approximately 50% – in essence, a coin toss. Accordingly, Piper and colleagues use the provocative analogy of “replicating” a neuroprotective drug animal study with a coin flip to highlight the need for larger sample sizes in replication experiments. The article also discusses a) the probability of obtaining a significant p value in a replication experiment, b) the variability of p values, and c) the pitfalls of simple binary significance testing in both initial preclinical experiments and small-sample replication studies.
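The "coin toss" claim can be checked with a short simulation. The sketch below (my own illustration, not code from the article; the per-group sample size n = 10 and the z-test framing are assumptions for simplicity) finds the effect size that lands an initial two-group experiment exactly at p = 0.05, then repeats the experiment many times with that effect as the true effect and counts how often the replication is again significant:

```python
import math
import numpy as np

rng = np.random.default_rng(0)
n = 10                           # per-group sample size (hypothetical)
z_crit = 1.96                    # two-sided 5% critical value, standard normal

# Effect size (in SD units) that lands the initial experiment exactly at p = 0.05:
# the observed difference in means equals z_crit * SE, where SE = sqrt(2/n).
d_obs = z_crit * math.sqrt(2 / n)

# Replicate many times, assuming the true effect equals the observed one
reps = 20_000
hits = 0
for _ in range(reps):
    a = rng.normal(0.0, 1.0, n)         # control group
    b = rng.normal(d_obs, 1.0, n)       # treated group with true effect d_obs
    z = (b.mean() - a.mean()) / math.sqrt(2 / n)
    if abs(z) > z_crit:                 # replication reaches p < 0.05
        hits += 1

power = hits / reps
print(f"estimated replication power: {power:.2f}")   # close to 0.50
```

The result hovers around 0.5 because the replication z statistic is centered on the critical value itself, so roughly half of all replications fall on the non-significant side.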
Importantly, the authors also explain the need for an effect size estimate in both initial and replication experiments and argue that researchers are well advised to center their work on the question “What is the effect (size)?” instead of the binary “Is there a statistically significant effect?”
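Shifting from the binary question to estimation means reporting an effect size with an interval. A minimal sketch of that practice (again my own illustration with simulated data, not the authors' analysis; the large-sample normal approximation for the standard error of Cohen's d is an assumption):

```python
import math
import numpy as np

rng = np.random.default_rng(1)
n = 10
a = rng.normal(0.0, 1.0, n)      # simulated control group
b = rng.normal(0.8, 1.0, n)      # simulated treated group (true d = 0.8)

# Cohen's d with the pooled standard deviation
sp = math.sqrt(((n - 1) * a.var(ddof=1) + (n - 1) * b.var(ddof=1)) / (2 * n - 2))
d = (b.mean() - a.mean()) / sp

# Approximate 95% CI via a large-sample standard error for d
se = math.sqrt(2 / n + d ** 2 / (4 * n))
lo, hi = d - 1.96 * se, d + 1.96 * se
print(f"d = {d:.2f}, 95% CI [{lo:.2f}, {hi:.2f}]")
```

With n = 10 per group the interval is wide, which makes the uncertainty visible in a way a lone p value does not.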

LINK