Teemu D. Laajala (1,2)
1: University of Turku, Turku (Finland), Department of Mathematics and Statistics
2: University of Colorado, Denver (CO, US), Anschutz Medical Campus, Department of Pharmacology

MANILA (MAtched ANimaL Analysis) is a novel web-based tool that uses predictive, and potentially complex, baseline characteristics to allocate animals to treatment groups in preclinical intervention studies. MANILA was motivated by the challenges in reproducibility and transparency reported in preclinical cancer research (Laajala TD, et al.; Aittokallio T, et al.), and by an internal need to standardize protocols and provide a generalizable tool that is also accessible to non-bioinformaticians. In addition to its interactive web-based interface, MANILA builds on the more extensive R package hamlet (hierarchical optimal matching and machine learning toolbox), which provides the open-source R functionality most relevant to preclinical experimentation. Expert users interested in the wider use of hamlet are encouraged to explore the R package on CRAN (Comprehensive R Archive Network), where it is extensively documented and exemplified.

MANILA identifies animal subgroups based on a selected dissimilarity metric, so that the predictive baseline characteristics of the animals within a subgroup portray a similar prognosis. These subgroups, dubbed submatches, are divided evenly and stochastically into blinded intervention arms, with the procedure optimized by a genetic algorithm. One of MANILA’s strengths is that the study arms are blinded, similar to how clinical trials are conducted. All of the allocated intervention arms are asymptotically similar, so any group label can be used as the control or comparison group. MANILA provides a versatile range of options for adjusting the matching procedure, including but not limited to the distance or dissimilarity metric, scaling, and genetic algorithm parameters.
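The matching-then-allocation idea can be illustrated with a minimal Python sketch. It is not MANILA's or hamlet's implementation: a simple greedy nearest-neighbour heuristic stands in for the genetic algorithm, and the function names (standardize, greedy_submatches, allocate), the arm labels, and the baseline values are hypothetical and chosen only for illustration.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two baseline vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def standardize(rows):
    """Scale each baseline variable to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / (len(c) - 1)) or 1.0
        for c, m in zip(cols, means)
    ]
    return [[(x - m) / s for x, m, s in zip(r, means, sds)] for r in rows]


def greedy_submatches(rows, group_size):
    """Group animals into submatches of `group_size` similar animals.

    Greedy heuristic (an assumption of this sketch, not the genetic
    algorithm): take the first unmatched animal and attach its nearest
    unmatched neighbours.
    """
    remaining = list(range(len(rows)))
    submatches = []
    while len(remaining) >= group_size:
        seed = remaining.pop(0)
        nearest = sorted(
            remaining, key=lambda j: euclidean(rows[seed], rows[j])
        )[: group_size - 1]
        submatches.append([seed] + nearest)
        remaining = [j for j in remaining if j not in nearest]
    return submatches, remaining  # `remaining` holds unmatched leftovers


def allocate(submatches, arms, rng):
    """Randomly assign one animal of every submatch to each blinded arm."""
    allocation = {}
    for members in submatches:
        labels = list(arms)
        rng.shuffle(labels)
        for animal, arm in zip(members, labels):
            allocation[animal] = arm
    return allocation


# Hypothetical baseline data: (tumor volume, body weight) for six animals.
baseline = [
    [120.0, 21.5], [118.0, 22.0], [250.0, 19.0],
    [245.0, 19.5], [60.0, 24.0], [65.0, 23.5],
]
arms = ["A", "B"]  # blinded arm labels; either can serve as the control
submatches, leftovers = greedy_submatches(standardize(baseline), len(arms))
allocation = allocate(submatches, arms, random.Random(2024))
```

Because every submatch contributes exactly one animal to each arm, the arms end up balanced with respect to the baseline variables used in the distance, which is why any label can later be chosen as the control.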

The web tool includes utilities that help non-expert users input and modify their raw data, for example through data transformations, inclusion and exclusion of variables or observations, and diagnostics. These utilities are complemented by a wide variety of visualizations such as heatmaps, hierarchical clustering, multidimensional scatterplots, and boxplots. Furthermore, MANILA offers mixed-effects models for testing differences in treatment effects after the interventions have been conducted. The original grouping of submatches can be included in these models to increase the power to detect differences between animals that had a similar prognosis based on the baseline variables. For downstream analyses, power curves and longitudinal regression curves can be generated.

Importantly, MANILA allows power calculations based on a representative simulated dataset or a pilot study. Whereas conventional power calculations require only fairly straightforward, expert-curated expected effect sizes and variances, power calculations for regression models are often more complex and unintuitive because of both experimental and modelling considerations. To this end, stratified bootstrapping (sampling with replacement) is offered for sampling groups of observations. Because multiple longitudinal observations are nested within an individual, this scheme samples whole individuals, as is appropriate for mixed-effects modelling, and it also accounts for complex phenomena such as right-censoring due to moribund animals. Such effects are difficult or impossible to supply to power calculations as mere parameter estimates, so this simulation approach leverages computational power to reuse existing or representative data for future studies with a similar setup, and hence increases research reproducibility and reliability.
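A minimal Python sketch of individual-level stratified bootstrapping is given below. It only illustrates the resampling scheme described above and is not the code used by MANILA or hamlet; the record layout (id, group, day, value), the function name stratified_bootstrap, and the example data are assumptions made for this sketch.

```python
import random
from collections import defaultdict


def stratified_bootstrap(records, rng):
    """Resample whole animals with replacement within each treatment group.

    `records` is a list of dicts with keys 'id', 'group', 'day', 'value'
    (a hypothetical layout). Each drawn animal keeps its full trajectory,
    so a trajectory cut short by censoring stays cut short.
    """
    by_animal = defaultdict(list)
    group_of = {}
    for rec in records:
        by_animal[rec["id"]].append(rec)
        group_of[rec["id"]] = rec["group"]

    ids_by_group = defaultdict(list)
    for animal, group in group_of.items():
        ids_by_group[group].append(animal)

    resampled = []
    new_id = 0
    for group, ids in sorted(ids_by_group.items()):
        # Draw as many animals as the group originally had, with replacement.
        for animal in rng.choices(ids, k=len(ids)):
            for rec in by_animal[animal]:
                # A fresh id keeps duplicated animals distinct in the model.
                resampled.append({**rec, "id": new_id, "source_id": animal})
            new_id += 1
    return resampled


# Hypothetical pilot data: tumor volume on days 0, 7 and 14.
# Animal "c3" was removed as moribund after day 7 (right-censored).
records = []
for animal, group, values in [
    ("c1", "control", [100, 180, 300]),
    ("c2", "control", [110, 200, 340]),
    ("c3", "control", [120, 260]),
    ("t1", "treatment", [105, 140, 190]),
    ("t2", "treatment", [95, 130, 170]),
    ("t3", "treatment", [115, 150, 210]),
]:
    for day, value in zip([0, 7, 14], values):
        records.append(
            {"id": animal, "group": group, "day": day, "value": value}
        )

boot = stratified_bootstrap(records, random.Random(42))
```

In a simulation-based power calculation, a resample like this would be drawn many times, the mixed-effects model fitted to each, and power estimated as the fraction of resamples in which the treatment effect is detected.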

The MANILA tool comes with a step-by-step user guide and is hosted at the University of Turku: https://biomedportal.utu.fi/utu-apps/Rvivo/

Alternatively, the user may download the R Shiny web app and run it locally, for example to speed up the sampling-based power calculations. The underlying R package hamlet is open source and freely available to expert users, and it offers functionality beyond what is exposed in the graphical interface.

Figure 1: A genetic algorithm is used to identify animals with a similar prognosis before the interventions, for randomized allocation to treatment arms.
Figure 2: Mixed-effects modeling is provided as a regression analysis tool for identifying intervention effects or conducting power calculations based on representative datasets.