Bayesian evolutionary hypergraph learning for predicting cancer clinical outcomes

Predicting the clinical outcomes of cancer patients is a challenging task in biomedicine. Personalized and refined therapies based on predicted prognostic outcomes have been actively sought over the past decade. Accurate prognostic prediction requires higher-order representations of the complex dependencies among genetic factors. However, the complexity of genetic interactions makes it difficult to identify their co-regulatory roles and functional effects on cancer prognosis. Here we propose a prognostic prediction model based on evolutionary learning that identifies higher-order prognostic biomarkers of cancer clinical outcomes. The proposed model represents the interactions of prognostic genes as a combinatorial space. It adopts a flexible hypergraph structure composed of a large population of hyperedges that encode higher-order relationships among many genetic factors. The hyperedge population is optimized by an evolutionary learning method based on sequential Bayesian sampling. The learning approach balances model performance against parsimony using information-theoretic dependency and complexity-theoretic regularization priors. Using MAQC-II project data, we demonstrate that our model handles high-dimensional data more effectively than state-of-the-art classification models. We also identify potential gene interactions that characterize prognosis and recurrence risk in cancer.
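The abstract does not give the learning equations, so the following is only a rough sketch of the general idea of a hyperedge population used as a classifier, not the authors' algorithm. It assumes discretized (here binary) expression features, and it swaps in plain truncation selection with an "accuracy minus size penalty" score where the paper uses sequential Bayesian sampling with information-theoretic and complexity priors. All names (`sample_hyperedge`, `fitness`, `evolve`, the `penalty` and `keep` parameters) and the toy data are hypothetical.

```python
import random
from collections import Counter


def sample_hyperedge(x, y, order, rng):
    """Draw `order` feature indices from one sample; store their values and the label."""
    idx = tuple(sorted(rng.sample(range(len(x)), order)))
    return (idx, tuple(x[i] for i in idx), y)


def matches(edge, x):
    """A hyperedge matches a sample when all of its stored feature values agree."""
    idx, vals, _ = edge
    return all(x[i] == v for i, v in zip(idx, vals))


def predict(population, x, default=0):
    """Majority vote over the labels of all matching hyperedges."""
    votes = Counter(edge[2] for edge in population if matches(edge, x))
    return votes.most_common(1)[0][0] if votes else default


def fitness(edge, X, Y, penalty=0.05):
    """Label agreement on matched samples minus a size penalty.

    Assumption: a crude stand-in for the dependency and complexity
    priors mentioned in the abstract, not the paper's actual score.
    """
    hits = [y == edge[2] for x, y in zip(X, Y) if matches(edge, x)]
    if not hits:
        return 0.0
    return sum(hits) / len(hits) - penalty * len(edge[0])


def evolve(X, Y, pop_size=200, max_order=3, generations=10, keep=0.5, seed=0):
    """Keep the fittest hyperedges each generation and refill with fresh samples."""
    rng = random.Random(seed)

    def new_edge():
        i = rng.randrange(len(X))
        return sample_hyperedge(X[i], Y[i], rng.randint(1, max_order), rng)

    population = [new_edge() for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda e: fitness(e, X, Y), reverse=True)
        survivors = population[: int(keep * pop_size)]
        population = survivors + [new_edge() for _ in range(pop_size - len(survivors))]
    return population


# Toy data (hypothetical): 4 binary features, label depends jointly on features 0 and 1.
X = [tuple((n >> b) & 1 for b in range(4)) for n in range(16)]
Y = [x[0] & x[1] for x in X]
pop = evolve(X, Y, pop_size=50, generations=5)
```

In this toy setting the label depends on two features jointly, so an order-2 hyperedge over features 0 and 1 describes the positive class exactly while no single feature does. That is the kind of higher-order relationship the hypergraph representation is meant to capture.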
