Towards Automated Hypothesis Testing in Neuroscience

Scientific data are generated continuously, yet once published, scientific studies do not take advantage of new data. To leverage this incoming flow of data, we present Neuro-DISK, an end-to-end framework that continuously processes neuroscience data and updates the assessment of a given hypothesis as new data become available. Our scope is the ENIGMA consortium, a large international collaboration in neuroimaging and genetics whose goal is to understand brain structure and function. Neuro-DISK includes an ontology and framework for organizing the datasets, cohorts, researchers, tools, working groups, and organizations that participate in multi-site studies such as those of ENIGMA, and an automated discovery framework that continuously tests hypotheses by executing scientific workflows. We illustrate the usefulness of our approach with an implemented example.
