Improving replicability using interaction with laboratories: a multi-lab experimental assessment

Experimentation with mouse and rat models has become a central strategy for discovering mammalian gene function and for preclinical testing of pharmacological treatments, yet the utility of any finding depends critically on its replicability in other laboratories. In previous publications we proposed a statistical approach for estimating the inter-laboratory replicability of novel discoveries made in a single laboratory. We demonstrated that previous phenotyping results from multi-lab databases can be used to derive a Genotype-by-Lab (GxL) adjustment factor that greatly enhances the replicability of single-lab findings for similarly measured phenotypes, even before making the effort of replicating these findings in additional laboratories. This demonstration, however, still raised several important questions that could only be answered by an additional large-scale prospective experiment: (1) Does GxL-adjustment work in single-lab experiments that were not intended to be standardized across laboratories, and with genotypes that were not included in the previous experiments? (2) Can it be used to adjust the results of pharmacological experiments? We investigated these questions by attempting to replicate, across three laboratories, results from five single-lab studies in the Mouse Phenome Database (MPD), offering 212 comparisons, including 60 involving a pharmacological treatment: 18 mg/kg/day fluoxetine. In addition, we define and use a dimensionless GxL factor, by dividing the GxL variance by the standard deviation between animals within groups, as a more robust vehicle to transfer the adjustment from the multi-lab analysis to very different labs and genotypes. For genotype comparisons, GxL-adjustment reduced the rate of non-replicable discoveries from 60% to 12%, at the price of reducing the power to make replicable discoveries from 87% to 66%. In absolute numbers, the adjustment prevented 23 non-replicable discoveries at the price of missing only three replicated ones. Tools and data needed for deployment of this method across other mouse experiments are publicly available in MPD. Our results further point to some phenotypes as more prone to produce non-replicable results, while others, known to be more difficult to measure, are, once adjusted, as likely to produce replicable results as a physiological measure such as body weight.
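The way a GxL adjustment widens a single-lab comparison can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions not stated in the abstract: the dimensionless factor is taken here as a variance ratio (`gamma_sq`, interaction variance over pooled within-group variance), the interaction enters the standard error of a two-genotype difference as an added term `2 * gamma_sq * s2`, and the p-value uses a plain normal approximation. The function name `gxl_adjusted_comparison` and the sample values are hypothetical.

```python
import math
from statistics import NormalDist, mean, variance


def gxl_adjusted_comparison(group_a, group_b, gamma_sq):
    """Compare two genotype groups measured in one lab, with and without
    a GxL adjustment.

    gamma_sq is an ASSUMED dimensionless GxL factor: the genotype-by-lab
    interaction variance divided by the pooled within-group variance.
    """
    n_a, n_b = len(group_a), len(group_b)

    # Pooled within-group variance between animals.
    s2 = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (
        n_a + n_b - 2
    )
    diff = mean(group_a) - mean(group_b)

    # Naive standard error: sampling noise between animals only.
    naive_se = math.sqrt(s2 * (1 / n_a + 1 / n_b))

    # Adjusted standard error (assumption): each genotype carries its own
    # lab-specific interaction term, so 2 * gamma_sq * s2 is added.
    adjusted_se = math.sqrt(s2 * (1 / n_a + 1 / n_b) + 2 * gamma_sq * s2)

    def two_sided_p(stat):
        # Normal approximation, used here only to keep the sketch small.
        return 2 * (1 - NormalDist().cdf(abs(stat)))

    return {
        "difference": diff,
        "naive_se": naive_se,
        "adjusted_se": adjusted_se,
        "naive_p": two_sided_p(diff / naive_se),
        "adjusted_p": two_sided_p(diff / adjusted_se),
    }


if __name__ == "__main__":
    # Hypothetical measurements for two genotypes in a single lab.
    strain_a = [10.0, 12.0, 11.0, 13.0, 12.0, 11.0]
    strain_b = [8.0, 9.0, 10.0, 9.0, 8.0, 10.0]
    print(gxl_adjusted_comparison(strain_a, strain_b, gamma_sq=0.5))
```

Under these assumptions the adjusted standard error is never smaller than the naive one, so an adjusted comparison is harder to declare significant. This matches the trade-off described above: fewer non-replicable discoveries at some cost in power.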
