SMILE: systems metabolomics using interpretable learning and evolution

Background: The direct link between metabolism and cell and organism phenotype in health and disease makes metabolomics, the high-throughput study of small-molecule metabolites, an essential methodology for understanding and diagnosing disease development and progression. Machine learning methods have seen increasing adoption in metabolomics thanks to their powerful predictive abilities. However, the “black-box” nature of many machine learning models remains a major obstacle to wide acceptance and utility, because it makes the decision process difficult to interpret. This challenge is particularly pronounced in biomedical research, where understanding the underlying decision-making mechanism is essential for ensuring safety and gaining new knowledge.

Results: In this article, we propose a novel computational framework, Systems Metabolomics using Interpretable Learning and Evolution (SMILE), for supervised metabolomics data analysis. Our methodology uses an evolutionary algorithm to learn interpretable predictive models and to identify the most influential metabolites and their interactions in association with disease. Moreover, we have developed a web application with a graphical user interface for easy analysis, interpretation, and visualization of the results. The performance of the method and the use of the web interface are demonstrated using metabolomics data for Alzheimer’s disease (AD).

Conclusions: SMILE identified several metabolites influential in AD and provided interpretable predictive models that can be used to better understand the metabolic background of AD. SMILE addresses the emerging issue of interpretability and explainability in machine learning and contributes to more transparent and powerful applications of machine learning in bioinformatics.
