Decision-enabling studies: Robust enough to support critical decisions?

Decision-making is an essential function of any company and determines long-term success. However, which key factors influence these decisions in the pharmaceutical industry on the path to new products and approved drugs?

To analyze the importance of Good Research Practice (GRP) standards, as well as of data quality and validity, for this decision process, we conducted an analysis of 12 drug discovery projects (preclinical up to clinical candidate selection) that were licensed over the past two years by three EU pharma companies. In total, 26 studies were identified as ‘critical’ (a consensus decision based on discussions with representatives of the licensee companies).

Post-licensing analysis of these ‘critical’ studies indicated that not every study was designed in a way consistent with its role in decision-making. Only around one third of all studies were properly blinded (see Figure), and only one quarter contained well-defined, pre-specified endpoints, a measure that significantly reduces bias and false experimental outcomes compared with post-hoc or secondary endpoint analyses.

Based on this analysis, PAASP estimates that at least 30% of early-stage innovative drug discovery projects critically depend on data that do not meet minimum quality criteria.

It is quite possible that decisions to license projects are often based on factors other than the quality of the research data, such as timing, organizational and cultural influences, subjective and personal considerations, or political pressures.

However, given the overall decline in drug R&D productivity (with poor preclinical data quality as a major contributing factor), decision-making during drug development should not neglect the assessment of data quality and integrity. Instead, the question should be asked whether or not GRP standards were implemented in the decision-enabling studies. Structured and informed decisions will help avoid unnecessary terminations of drugs in Phase II/III development.

Scientists believed a whiff of the bonding hormone Oxytocin could increase trust between humans. Then they went back and checked their work… 

Over the last two decades, the neuropeptide Oxytocin (OT) has been studied extensively, and many articles have been published about its role in humans’ emotional and social lives, e.g. increasing trust and sensitivity to others’ feelings. Even a TED talk has been recorded (‘Trust, morality – and oxytocin?’) with over 1.4 million views.
The human trials were based on early animal studies, in which manipulations of the OT system translated into behavioral phenotypes affecting social cognition, bonding and individual recognition.
However, some recent publications question the sometimes bewildering evidence for the role of OT in influencing complex social processes in humans and report failures to reproduce some of the field’s most influential studies. Furthermore, no elevated cerebrospinal fluid (CSF) OT levels could be detected 45 minutes after administration, the time window in which most behavioral tasks took place (Striepens et al., 2013). CSF OT concentrations were increased only after 75 minutes, indicating that OT pharmacokinetics is not fully understood. Moreover, it is still unclear whether the doses usually administered in the field (between 24 and 40 IU) can deliver enough OT to the brain to produce significant changes in individuals (Leng et al., 2016).
This ultimately leads to the following question: ‘If the published literature on the OT effects does not reflect the true state of the world, how has the vast behavioral OT literature accumulated (Lane et al., 2016)?’
Several possible scenarios and reasons are currently discussed and analyzed amongst OT researchers, demonstrating the crucial importance of implementing Good Research Practice standards, proper study design and a priori statistical power calculations:

Power analysis:
A meta-analysis of the effects of OT on human behavior found that the average OT study in healthy individuals has a statistical power of 16%, given a median sample size of 49 individuals. For clinical trials, the statistical power was even lower (12%), given a median sample size of 26 individuals (Walum et al., 2016).
Hence, OT studies in humans are dangerously underpowered, as 80% is normally considered the standard for minimal adequate statistical power. Even for studies with the largest effect and sample sizes (N = 112), the statistical power was lower than 70%. In order to achieve 80% power for the average effect size reported, a sample size of 352 healthy individuals would be needed (310 individuals for clinical trials).
Statistical power is the probability that a test will correctly reject the null hypothesis when a true effect of a given size exists. In other words, replication attempts of true-positive OT studies (with the same sample sizes) would be expected to fail 84% or 88% of the time, corresponding to the false negative rates implied by 16% and 12% power, respectively. To further aggravate the problem, the observed effect size in underpowered studies is likely to be highly exaggerated, a phenomenon also known as “the winner’s curse”: in small samples, only unusually large (and therefore overestimated) effects cross the significance threshold.
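To make these numbers concrete, the following minimal Python sketch uses the standard normal approximation for a two-sample comparison to compute achieved power and the sample size required for 80% power. The effect size d = 0.28 is an illustrative assumption chosen to roughly match the reported 16% power at about 25 subjects per arm; it is not the exact meta-analytic estimate from Walum et al., so the resulting sample sizes will not match their figures exactly.

```python
from math import ceil, sqrt
from scipy.stats import norm

ALPHA = 0.05  # conventional two-sided significance level

def power_two_sample(d, n_per_group, alpha=ALPHA):
    """Approximate power of a two-sample comparison (normal approximation)
    for a standardized effect size d and n_per_group subjects per arm."""
    z_crit = norm.ppf(1 - alpha / 2)
    return norm.cdf(abs(d) * sqrt(n_per_group / 2) - z_crit)

def n_per_group_for_power(d, target_power=0.80, alpha=ALPHA):
    """Subjects per group needed to reach target_power for effect size d."""
    z_crit = norm.ppf(1 - alpha / 2)
    z_power = norm.ppf(target_power)
    return ceil(2 * ((z_crit + z_power) / abs(d)) ** 2)

d = 0.28  # illustrative effect size (assumption, see above)
print(power_two_sample(d, n_per_group=25))  # ~0.17, close to the reported 16%
print(n_per_group_for_power(d))             # ~201 per group, i.e. ~400 in total
```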
In addition, this meta-analysis demonstrated that the positive predictive value (PPV) of those studies (calculated from information on power, the pre-study odds and the alpha level) is low. It was therefore concluded that most of the reported positive findings in this field are likely to be false positives (Walum et al., 2016).
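The PPV calculation itself is simple. In the framework popularized by Ioannidis (2005), PPV = R(1 − β) / (R(1 − β) + α), where R is the pre-study odds of a true effect, 1 − β is the power and α is the significance level. A short sketch, in which the pre-study odds of 1:4 are an illustrative assumption rather than a value taken from Walum et al.:

```python
def ppv(power, prior_odds, alpha=0.05):
    """Positive predictive value: the probability that a statistically
    significant finding reflects a true effect (Ioannidis-style)."""
    return (power * prior_odds) / (power * prior_odds + alpha)

# Illustrative assumption: pre-study odds R = 1:4 (prior_odds = 0.25)
print(ppv(power=0.16, prior_odds=0.25))  # ~0.44 at the reported 16% power
print(ppv(power=0.80, prior_odds=0.25))  # 0.80 at the conventional 80% power
```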

Publication bias:
Almost all of the studies investigated as part of the meta-analysis (29 out of 33) reported at least one positive result (p-value below 0.05). This huge excess of statistically significant findings clearly points towards a phenomenon referred to as the ‘file-drawer effect’ or publication bias, suggesting that a substantial number of negative or inconclusive findings remain unpublished.
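A back-of-the-envelope excess-significance check illustrates the point: with 16% average power, only about five of 33 studies would be expected to come out significant even if every study tested a true effect, not 29. The sketch below treats each study as a single test, which is a simplification (the studies reported ‘at least one’ positive result across several tasks), so the exact p-value should not be over-interpreted:

```python
from scipy.stats import binomtest

n_studies, n_significant = 33, 29
p_hit = 0.16  # chance of a significant result per study if power is 16%

# Expected number of significant studies if every study tested a true effect:
print(n_studies * p_hit)  # ~5.3, far below the 29 actually observed

# Probability of seeing 29 or more significant results out of 33 by chance:
result = binomtest(n_significant, n_studies, p_hit, alternative="greater")
print(result.pvalue)  # vanishingly small -> consistent with publication bias
```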
In a laudable attempt to investigate whether there is a file-drawer problem in OT research, Anthony Lane at the Catholic University of Louvain analyzed all studies performed in his laboratory from 2009 until 2014, covering a total of 453 subjects (Lane et al., 2016). Indeed, he found a statistically significant effect of OT for only one out of 25 tasks. This large proportion of ‘unexpected’ null findings, most of which were never published, raised concerns about the validity of what is known about the influence of OT on human behavior and cognition. A. Lane therefore states: ‘Our initial enthusiasm for OT has slowly faded away over the years and the studies have turned us from believers into skeptics.’
This publication bias is further reinforced by the current publication culture and the strong tendency of journals to favor results that confirm hypotheses while neglecting unconvincing data.

Study design:
In addition to publication bias, the excess of significant OT effects may also result from methodological, measurement or statistical artefacts: A. Lane’s laboratory also reported a widespread use of ‘between-subject’ designs with relatively small sample sizes (around 30 individuals per study), which carry the risk of attributing effects to OT that are in fact generated by various unobserved factors, e.g. the personality of participants (Lane et al., 2016).
Furthermore, Lane et al. twice failed to replicate their own previous study (Lane et al., 2015), which had shown a powerful effect of OT on increasing the trusting behavior of participants. Notably, in the original study, OT administration followed a single-blind procedure, in which the subject is blind to the treatment condition but the experimenter is not, introducing the risk that the experimenter might unconsciously act differently and thereby influence the subjects’ behavior to confirm the researcher’s hypothesis (unconscious behavioral priming). Both subsequent replication attempts were conducted in a double-blinded manner!

Importantly, the statistical and methodological limitations discussed here are not specific to the OT field and also directly affect other areas of biomedical research. Nevertheless, a systematic change in research practices and in the OT publication process is required to increase the trustworthiness and integrity of the data and to reveal the true state of OT effects. The adherence to detailed Good Research Practices (e.g. a priori power calculations and accurate blinding procedures) and a transparent reporting of methods and findings should therefore be strongly encouraged.

Building a nearshoring collaboration: Success story

Starting to work with someone you do not know involves some risk-taking if this person is outside your regular circle of collaborators and established partners. We often observe that pharma companies are much more open to collaborating with service providers in Far East countries than in Eastern Europe. There is clearly a major trend to work with partners in China (and, to some extent, India), and the cost benefits appear obvious (or, perhaps better to say, used to be obvious). There is also an established market of many hundreds, if not thousands, of companies in Asia that offer a variety of services, and starting a collaboration there is not much different from going to a shop and picking up a product that looks like what you need.

Building a collaboration with a lab in Eastern (or Central) European countries requires a different approach. The first and most critical step is to obtain initial experience that can reveal the advantages of nearshoring (outsourcing to neighbors) and convince internal decision makers to go for a larger project.

As an example of such a first step, we would like to refer to a project that we supported. A major pharma company active in the fields of Immunology and Neuroscience was running a variety of internal drug discovery projects, each of them requiring the support of anatomy and histology groups. With all of the company’s main research facilities located in a Western European country, it proved problematic to engage highly experienced and highly qualified internal research staff for large volumes of routine work (essentially preparing tissue for histological analysis). An additional challenge was the reluctance of upper management to increase headcount in R&D. And, not surprisingly, reducing the histology support of the projects, another option to lower the department’s workload, was fiercely resisted by project leaders. The histology department had clearly become a success-critical project bottleneck when an interesting alternative presented itself.

Looking for solutions to eliminate this success-critical project bottleneck, we identified a lab in Poland that employed people with highly developed histology skills, had at least 50% of its resources unused (due to limited funding) and showed a high motivation to work. Two people from this lab were sent to the pharma company’s research center for a one-month training program. Upon their return, the lab received the basic tools necessary to conduct the work, along with the tissue samples to process. We helped both sides address all legal aspects and negotiate terms that were acceptable to each of them.

We refer to this collaboration as a success story because, within a fairly short period of time, the project grew beyond its originally intended focus on the outsourcing of routine work. Once the pharma company’s managers had gained this first experience, other nearshoring projects of increasing complexity and diversity followed.