Conflict of interest – Education needed!

A “Conflict of Interest (CoI)” is a special situation that we are usually reminded of when a famous scientist is criticized – most often in connection with highly publicized areas of science such as CRISPR or Covid-19.

For many scientists in academia, awareness of CoI issues is limited to what is found in journals’ Instructions for Authors. For example, the European Journal of Neuroscience states:

“EJN requires that all authors disclose any potential sources of conflict of interest. Any interest or relationship, financial or otherwise, which might be perceived as influencing an author’s objectivity is considered a potential source of conflict of interest. The existence of a conflict of interest does not preclude publication in this journal. The absence of conflicts of interest should also be stated. It is the responsibility of the corresponding author to ensure this policy is adhered to.”

Those who would like to understand whether there is something to disclose will have to do some research of their own and look for relevant guidance. One credible source of information is the 2009 book “Conflict of Interest in Medical Research, Education, and Practice” published by the Institute of Medicine. Those who have the time and patience to read its 392 pages will be equipped with all the essential knowledge and will understand how to judge whether a CoI exists:

“Conflicts of interest are defined as circumstances that create a risk that professional judgments or actions regarding a primary interest will be unduly influenced by a secondary interest. Primary interests include promoting and protecting the integrity of research, the quality of medical education, and the welfare of patients. Secondary interests include not only financial interests—the focus of this report—but also other interests, such as the pursuit of professional advancement and recognition and the desire to do favors for friends, family, students, or colleagues. Conflict of interest policies typically focus on financial gain because it is relatively more objective, fungible, and quantifiable. Financial gain can therefore be more effectively and fairly regulated than other secondary interests.

The severity of a conflict of interest depends on (1) the likelihood that professional decisions made under the relevant circumstances would be unduly influenced by a secondary interest and (2) the seriousness of the harm or wrong that could result from such an influence. The likelihood of undue influence is affected by the value of the secondary interest, its duration and depth, and the extent of discretion that the individual has in making important decisions.”

Those who search such books for quick guidance may find the following table useful:

Table 1: Candidate List of Categories of Financial Relationships with Industry to be Disclosed

Research grants and contracts
Consulting agreements
Participation in speakers bureaus
Intellectual property, including patents, royalties, licensing fees
Stock, options, warrants, and other ownership (excepting general mutual funds)
Position with a company:
– Company governing boards
– Technical advisory committees, scientific advisory boards, and marketing panels
– Company employee or officer, full or part time
Authorship of publications prepared by others
Expert witness for a plaintiff or defendant
Other payments or financial relationships

Such summary tables with examples are indeed very useful, as they can support decision making in some cases. Yet one should not view such tables as exhaustive checklists of all potential examples: there is still a need to understand what may constitute a CoI and to assess every situation individually.

Indeed, the title of the table above focuses on relationships with industry and may therefore create the misleading impression that these categories are relevant only where a relationship with industry already exists.

In a recent example, an academic discovery originated from research funded by a non-profit patient advocacy group, with apparently no industry involvement either when the patent application was filed or later when a related scientific publication appeared. Yet even in such a case, the author should have disclosed the existence of the patent application and informed the reader about a potential conflict of interest. This is why conflict of interest policies generally emphasize preventive measures such as transparent disclosure.

Biological vs technical replicates: Now from a data analysis perspective

We have discussed this topic several times before (HERE and HERE). There seems to be a growing understanding that, when reporting an experiment’s results, one should state clearly what experimental units (biological replicates) are included, and, when applicable, distinguish them from technical replicates.

In discussing this topic with various colleagues, it became obvious to us that there is no clarity about best analytic practices and, specifically, about how technical replicates should be handled in the analysis.

We have approached David L McArthur (at the UCLA Department of Neurosurgery), an expert in study design and analysis, who has been helping us and the Preclinical Data Forum on projects related to data analysis and robust data analysis practices.

A representative example that we wanted to discuss includes 3 treatment groups (labeled A, B, and C) with 6 mice per group and 4 samples processed for each mouse (e.g. one blood draw per mouse separated into four vials and subjected to the same measurement procedure) – i.e. a 3X6X4 dataset.

The text below is based on Dave’s feedback. Note that Dave uses the term “facet” as an overarching label for anything that contributes (or fails to contribute) to interpretable coherence beyond background noise in the dataset, and the term “measurement” as a label for the observed value obtained from each sample (rather than the phrase “dependent variable” often used elsewhere).

Dave has drafted a thought experiment supported by a simulation.  With a simple spreadsheet using only elementary function commands, it’s easy to build a toy study in the form of a flat file representing that 3X6X4 system of data, with the outcome consisting of one measurement in each line of a “tall” datafile, i.e., 72 lines of data with each line having entries for group, subject, sample, and close-but-not-quite-identical measurement (LINK). But, for our purposes, we’ll insert not just measurement A but also measurement B on each line — where we’ve constructed measurement B to differ from measurement A in its variability but otherwise to have identical group means and subject means.  (As shown in Column E, this can be done easily: take each A value, jitter it by uniform application of some multiplier, then subtract out any per-subject mean difference to obtain B.)  With no loss of meaning, in this dataset measurement A has just a little variation from one measurement to the next within a given subject, but because of that multiplier, measurement B has a lot of variation from one measurement to the next within a given subject.
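The construction Dave describes can also be reproduced without a spreadsheet. Below is a minimal stand-alone Python sketch (the group means 0.85, 1.45 and 2.05 are taken from the summary tables below; the uniform noise range and the multiplier of 20 are our assumptions, not Dave’s exact settings) that builds the 72-line “tall” file carrying both measurements:

```python
import random

random.seed(1)

GROUP_MEANS = {"A": 0.85, "B": 1.45, "C": 2.05}  # group centers from the toy summary
N_MICE, N_SAMPLES, MULTIPLIER = 6, 4, 20.0       # multiplier is an illustrative choice

rows = []  # tall file: one line per sample -> 3 groups x 6 mice x 4 samples = 72 rows
for group, mu in GROUP_MEANS.items():
    for mouse in range(N_MICE):
        # measurement A: small within-subject variation around the group mean
        a_vals = [mu + random.uniform(-0.3, 0.3) for _ in range(N_SAMPLES)]
        a_mean = sum(a_vals) / N_SAMPLES
        # measurement B: jitter each A value by a uniform multiplier, then
        # subtract out any per-subject mean difference, as described in the text
        b_raw = [a_mean + MULTIPLIER * (a - a_mean) for a in a_vals]
        b_mean = sum(b_raw) / N_SAMPLES
        b_vals = [b - (b_mean - a_mean) for b in b_raw]
        for sample, (a, b) in enumerate(zip(a_vals, b_vals)):
            rows.append((group, mouse, sample, a, b))
```

Because B is built by scaling each A value’s deviation around its subject mean, every per-subject (and hence per-group) mean of B matches A’s, while the within-subject spread is 20 times larger.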

A 14-term descriptive summary (selected terms reproduced below) of all values of measurement A, across groups, gives:

              Group A   Group B   Group C
robust min     0.3000    0.9000    1.5000
hdQ: 0.25      0.6380    1.2380    1.8380   (25th quantile, the lower box bound of a boxplot)
hdQ: 0.75      1.0620    1.6620    2.2620   (75th quantile, the upper box bound of a boxplot)
robust max     1.4000    2.0000    2.6000
Huber mu       0.8500    1.4500    2.0500
Shapiro p      0.9703    0.9703    0.9703

while using all values of measurement B, across groups, gives:

              Group A   Group B   Group C
mean           0.8500    1.4500    2.0500   <– identical group means
SD             5.7131    5.7131    5.7131   <– group standard deviations about 20 times larger
robust min    -6.9000   -6.3000   -5.7000
hdQ: 0.25     -4.2657   -3.6657   -3.0657
median         0.8500    1.4500    2.0500   <– identical group medians
hdQ: 0.75      5.9657    6.5657    7.1657
robust max     8.6000    9.2000    9.8000
skew          -0.0000   -0.0000   -0.0000   <– identical group skews
kurtosis      -1.3908   -1.3908   -1.3908   <– greater kurtoses, no surprise
Huber mu       0.8500    1.4500    2.0500   <– identical Huber estimates of group centers
Shapiro p      0.0078    0.0078    0.0078   <– suspiciously low p-values for the normality test, no surprise

The left panel in the image below results from simple arithmetical averaging of each subject’s samples, reducing the working dataframe from 72 lines to 18. It does not matter here whether we analyze measurement A or measurement B: both measurements in this artificial dataset generate the identical 18-line dataframe, with means of 0.8500, 1.4500, and 2.0500 for groups A, B and C respectively. Importantly, the sample facet disappears altogether, though we still have group, mouse, measurement and noise. A simple ANOVA on the subject means shows “very highly significant” differences between the groups. But wait.
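The left-panel step – average the technical replicates per mouse, then run a one-way ANOVA on the 18 subject means – can be sketched in plain Python. The post’s actual analyses used an R script; the per-mouse means below are simulated under assumed noise, with only the group means (0.85, 1.45, 2.05) and the 6-mice-per-group layout taken from the example:

```python
import random

random.seed(2)

# Hypothetical per-mouse means (6 mice per group) after averaging each
# mouse's 4 technical replicates; group centers follow the toy study.
groups = {
    "A": [0.85 + random.gauss(0, 0.1) for _ in range(6)],
    "B": [1.45 + random.gauss(0, 0.1) for _ in range(6)],
    "C": [2.05 + random.gauss(0, 0.1) for _ in range(6)],
}

def one_way_anova_f(samples):
    """Classic one-way ANOVA F statistic: between-group MS / within-group MS."""
    all_vals = [v for g in samples.values() for v in g]
    grand = sum(all_vals) / len(all_vals)
    k, n = len(samples), len(all_vals)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in samples.values())
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in samples.values() for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

f_stat = one_way_anova_f(groups)  # large F on 2 and 15 df -> "very highly significant"
```

Note that this analysis sees only 18 numbers: the sample facet has already been averaged away before the test is run.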

The center panel uses all 72 available datapoints from measurement A.  By definition that’s in the form of a repeated-measures structure, with four non-identical samples provided by each subject.  Mixed effects modeling accounts for all 5 facets here by treating them as fixed (group and sample) or random (subject), or as the object of the equation (measurement), or as residual (noise).  The mixed effects model analysis for measurement A results in “highly significant” differences between groups, though those p-values are not the same as those in the left panel.  But wait.
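In model form, the facet assignment Dave describes corresponds to a mixed model of roughly this shape (our notation, not taken from the original script):

```latex
y_{gsm} = \mu + \alpha_g + \beta_s + u_m + \varepsilon_{gsm},
\qquad u_m \sim \mathcal{N}(0, \sigma_u^2),
\qquad \varepsilon_{gsm} \sim \mathcal{N}(0, \sigma^2)
```

where alpha_g and beta_s are the fixed group and sample effects, u_m is the random subject effect, and epsilon is the residual noise facet; the measurement y itself is the object of the equation.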

The right panel uses all 72 available datapoints from measurement B.  Again, it’s a repeated-measures structure, but while the means and medians remain the same, now the standard deviations are 20 times larger than those for measurement A, a feature of the noise facet being intentionally magnified and inserted into the artificial source datafile. The mixed effects model analysis for measurement B results in “not-at-all-close-to-significant” differences between groups; no real surprise.
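The A-versus-B contrast can also be seen without any statistics package. The naive one-way ANOVA below is run on all 72 points and ignores the repeated-measures structure, so it is not a substitute for the mixed model described above – it only illustrates how B’s magnified within-subject noise swamps the unchanged group differences. The simulation parameters mirror the toy example (group means, 6 mice, 4 samples, multiplier 20), but the data are regenerated here under our own assumed noise:

```python
import random

random.seed(3)

def f_statistic(groups):
    """One-way ANOVA F statistic: between-group MS divided by within-group MS."""
    vals = [v for g in groups for v in g]
    grand = sum(vals) / len(vals)
    k, n = len(groups), len(vals)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

MULTIPLIER = 20.0
a_groups, b_groups = [], []
for mu in (0.85, 1.45, 2.05):
    a_all, b_all = [], []
    for _mouse in range(6):
        a = [mu + random.uniform(-0.3, 0.3) for _ in range(4)]  # measurement A
        m = sum(a) / 4
        b = [m + MULTIPLIER * (x - m) for x in a]  # same subject mean, ~20x the spread
        a_all.extend(a)
        b_all.extend(b)
    a_groups.append(a_all)
    b_groups.append(b_all)

f_a = f_statistic(a_groups)  # large: group separation dominates the small noise
f_b = f_statistic(b_groups)  # small: the magnified within-subject noise swamps it
```

The group means are identical for A and B, yet the F statistic for B collapses – the same qualitative pattern the mixed model shows in the right panel.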

What does this example teach us?

Averaging technical replicates (as in the left panel) and running statistical analyses on average values means losing potentially important information.  No facet should be dropped from analysis unless one is confident that it can have absolutely no effect on analyses.  A decision to ignore a facet (any facet), drop data and go for a simpler statistical test must in any case be justified and defended.

Further recommendations that are supported by this toy example or that the readers can illustrate for themselves (with the R script LINK) are:

  • There is no reason to use the antiquated method of repeated measures ANOVA; in contrast to RM ANOVA, mixed effects modeling makes no sphericity assumption and handles missing data well.
  • There is no reason to use nested ANOVA in this context:  nesting is applicable in situations when one or another constraint does not allow crossing every level of one factor with every level of another factor.  In such situations with a nested layout, fewer than all levels of one factor occur within each level of the other factor.  By this definition, the toy example here includes no nesting.
  • The expanded descriptive summary can be highly instructive (and is yours to use freely). 

And last but not least, whatever method is used for the analysis, the main message should not be lost – one should be maximally transparent about how the data were collected, what the experimental units were, what the replicates were, and what analyses were used to examine the data.

Workshop at ERA-NET Neuron meeting

ERA-NET Neuron organised a workshop on research data quality on January 22nd, 2019, in Bonn.

PAASP team members Anton Bespalov and Martin Michel contributed to the program by leading two breakout sessions. The slides used to guide the discussion in these sessions can be found below.

Anton Bespalov’s session on “Practical issues in preclinical study design”:

Martin Michel’s session on Data Analysis:

NC3Rs Workshop: ‘Improving peer review of in vivo research proposals’

On May 9th, 2018, the NC3Rs hosted a workshop on ‘Improving peer review of in vivo research proposals’, which included presentations by several of the most prominent figures in the field of preclinical data reproducibility and Good Research Practice. All talks can still be watched online!

The workshop was chaired by Mark Prescott, who leads the Policy and Outreach Group at the NC3Rs. One important aim was to raise awareness of the importance of good experimental design and reporting for in vivo research.
The speakers were Dr Frances Rawle, Prof Malcolm Macleod, Dr Kate Button, Prof Hazel Inskip, Dr Nathalie Percie du Sert and Prof Ulrich Dirnagl.
Frances Rawle gave an overview of the revised guidance for research proposals involving animal use submitted to the MRC. She pointed out how important it is for proposals involving animal research to undergo a rigorous review process – especially regarding the numbers of animals used in the proposed experiments. This was picked up by Kate Button, who explained the issues related to underpowered studies in detail. Hazel Inskip continued the statistical theme by focusing on effect sizes. Nathalie Percie du Sert presented the ARRIVE guidelines and the Experimental Design Assistant (EDA), developed by the NC3Rs, which provides very useful technical support for designing experimental studies.
Overall, these are all very informative presentations, well worth watching!

Educational Corner – March 2018

In this Newsletter issue, we introduce a new section – the ‘Educational Corner’.

At regular intervals, we will select a published paper that our readers may use themselves or suggest to their students for self-study or Journal Club discussions. We will try to select publications that represent interesting areas of science where it may be particularly important to understand how robust the published evidence actually is.
We encourage our readers to send us comments and analyses of these papers and the most interesting feedback will be shared in the next NL issue(s).

For this issue, we chose the paper by Qu et al. (Scientific Reports 2017, Vol. 7, Article number 15725) on the gut microbiota and its role in the mechanisms underlying the antidepressant actions of (R)-ketamine and lanicemine in a chronic social defeat stress model. We realize that this paper may have been published in a journal that does not endorse the highest publication standards (LINK), but we nevertheless found the subject exciting and the paper itself a good example for discussion with students, postdocs and other colleagues.