The seminal paper by John Ioannidis, “Why Most Published Research Findings Are False” (2005 PLoS Medicine 2: e124), contains statements that are easy to understand and act upon, e.g. that smaller sample sizes make research findings less likely to be true. However, other statements, although very well presented and discussed in this highly cited paper, are rather difficult to follow and to implement in research practice. For example, it has been argued for years that it is important to estimate how likely a phenomenon is to be real given the general knowledge in the area prior to the study (pre-study probability). This Bayesian thinking can be convincingly illustrated (Nuzzo (2014) Nature 506: 150) but, for most biomedical scientists and many research situations, it is difficult to implement, since the pre-study probability is hard to estimate.
Ioannidis (2005, Table 4) provides some rough examples of the ratios of true to non-true relationships for different study types, but these are not always helpful when a scientist wants to apply them to a particular research plan. Nevertheless, Table 4 in Ioannidis (2005) can serve as a starting point and, by analyzing relevant examples, one may arrive at a set of formal criteria that could help scientists estimate the pre-study odds for their own projects. We present this case study as an example that can stimulate such discussion:
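For readers who want a quantitative feel for how pre-study odds interact with statistical power and significance threshold, Ioannidis (2005) relates the pre-study odds R to the post-study probability that a claimed finding is true (the positive predictive value, PPV = (1 − β)R / (R − βR + α)). This can be sketched in a few lines of Python; the function name is ours, and the example values of R are hypothetical illustrations in the spirit of Table 4, not values taken from the paper:

```python
def post_study_probability(pre_study_odds, power=0.80, alpha=0.05):
    """Positive predictive value (PPV) per Ioannidis (2005):
    PPV = (1 - beta) * R / (R - beta * R + alpha),
    where R are the pre-study odds that the probed relationship is true,
    beta = 1 - power is the type II error rate, and alpha the type I error rate.
    """
    beta = 1.0 - power
    r = pre_study_odds
    return (1.0 - beta) * r / (r - beta * r + alpha)

# Hypothetical pre-study odds for three kinds of claims:
for label, r in [("confirmatory, well-grounded claim", 1.0),
                 ("plausible exploratory hypothesis", 0.1),
                 ("long-shot discovery claim", 0.01)]:
    print(f"{label}: R = {r}, PPV = {post_study_probability(r):.2f}")
```

Under these illustrative assumptions (80% power, α = 0.05), a nominally significant result supporting a long-shot claim (R = 0.01) is still far more likely to be false than true, which is exactly the point the pre-study probability argument makes.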
Aging is a slow process that likely involves multiple interconnected and highly complex mechanisms. For most people without specific hypotheses about aging mechanisms, this statement sounds reasonable. How likely is it, then, that a single protein administered over a fairly short period of time will reverse the signs of aging?
Three papers published in highly respected journals (Loffredo et al (2013) Cell; Katsimpardi et al (2014) Science; Sinha et al (2014) Science) presented data suggesting that a four-week treatment with a protein called GDF11 makes the heart, skeletal muscle and brain of old mice look and perform like those of young ones. Unsurprisingly, other labs tried to follow up on these publications and arrived at conflicting evidence (summarized at: http://www.sciencemag.org/news/2015/10/antiaging-protein-real-deal-harvard-team-claims). First, the quality of the research tools used (antibody specificity) was questioned by a study arguing that GDF11 actually accumulates with age and inhibits muscle regeneration (Egerman et al (2015) Cell Metabolism). Second, the originating lab re-ran the study with a more appropriate design and found that GDF11 treatment affects heart muscle equally in old and young mice. Third, these are single-dose studies, even though they fall squarely within the domain of pharmacology, which demands dose-effect analysis. Consequently, discrepant results are attributed to one lab apparently testing higher doses than the other, and to a seemingly very narrow “therapeutic window” for the desired effect (so that only the “lucky” lab works with the right dose and therefore reports positive effects).
Besides illustrating the importance of properly validating research tools and investing in optimal study design, this case study supports the need to take all statements in Ioannidis (2005) seriously: it is crucially important to weigh pre-study odds against the scientific excitement about the obtained study results.