Inferring Disease-Related Metabolite Dependencies with a Bayesian Optimization Algorithm

Understanding disease-related metabolite interactions is a key issue in computational biology. We apply a modified Bayesian Optimization Algorithm to targeted metabolomics data from plasma samples of insulin-sensitive and insulin-resistant subjects, all of whom suffer from non-alcoholic fatty liver disease. In addition to improving classification accuracy by selecting relevant features, we extract the information that led to their selection and reconstruct networks from the detected feature dependencies. We compare the influence of several classifiers and scoring metrics and examine whether the reconstructed networks represent physiological metabolite interconnections. We find that the presented method significantly improves the classification accuracy of metabolomics data that are otherwise hard to classify, and that the detected metabolite dependencies can be mapped to physiological pathways, which are in turn supported by the domain literature.
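To make the idea of reading feature dependencies out of a selected population more concrete, the following is a minimal sketch and not the modified algorithm used in this work. It assumes binary feature masks, truncation selection, and pairwise mutual information as a simple proxy for the Bayesian network that the Bayesian Optimization Algorithm would learn with a scoring metric. The function names and `toy_fitness`, which stands in for a classifier's accuracy, are hypothetical.

```python
import math
import random
from itertools import combinations


def mutual_information(pop, i, j):
    """Empirical mutual information (in nats) between bit positions i and j."""
    n = len(pop)
    joint, count_i, count_j = {}, {}, {}
    for ind in pop:
        a, b = ind[i], ind[j]
        joint[(a, b)] = joint.get((a, b), 0) + 1
        count_i[a] = count_i.get(a, 0) + 1
        count_j[b] = count_j.get(b, 0) + 1
    return sum(
        (c / n) * math.log(c * n / (count_i[a] * count_j[b]))
        for (a, b), c in joint.items()
    )


def truncation_select(pop, fitness, fraction=0.5):
    """Keep the best `fraction` of the population (higher fitness is better)."""
    ranked = sorted(pop, key=fitness, reverse=True)
    return ranked[: max(1, int(len(ranked) * fraction))]


def dependency_edges(selected, threshold=0.05):
    """Return (i, j, MI) for each feature pair whose MI exceeds `threshold`."""
    n_features = len(selected[0])
    edges = []
    for i, j in combinations(range(n_features), 2):
        mi = mutual_information(selected, i, j)
        if mi > threshold:
            edges.append((i, j, mi))
    # Strongest dependencies first.
    return sorted(edges, key=lambda e: e[2], reverse=True)


def toy_fitness(mask):
    """Hypothetical stand-in for classifier accuracy on the selected features.

    Features 0 and 1 only help when chosen together, feature 2 helps on its
    own, and every selected feature carries a small cost.
    """
    return 2.0 * (mask[0] & mask[1]) + mask[2] - 0.3 * sum(mask)


if __name__ == "__main__":
    rng = random.Random(0)
    # Random feature masks over six hypothetical metabolites.
    population = [[rng.randint(0, 1) for _ in range(6)] for _ in range(200)]
    selected = truncation_select(population, toy_fitness, fraction=0.5)
    for i, j, mi in dependency_edges(selected):
        print(f"feature {i} -- feature {j}: MI = {mi:.3f}")
```

In this sketch, the edges returned by `dependency_edges` play the role of the reconstructed network: pairs of features that tend to be selected together, or tend to exclude each other, among the fitter masks. A full implementation would replace the pairwise measure with structure learning under a scoring metric and replace `toy_fitness` with cross-validated classifier performance.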
