The inference of breast cancer metastasis through gene regulatory networks

Understanding the mechanisms of gene regulation in breast cancer is one of the most difficult problems facing oncologists, because this regulation is likely composed of complex genetic interactions. Given this complexity, computational studies have used the Bayesian network technique to construct gene regulatory networks from microarray data. Although the Bayesian network has been recognized as a prominent method for inferring gene regulatory processes, learning the Bayesian network structure is NP-hard and computationally demanding. Therefore, we propose a novel inference method based on low-order conditional independence, extended to the Bayesian network setting, to deal with a large number of genes and an insufficient sample size. This method has been evaluated on a publicly available breast cancer data set and compared with full-order conditional independence and with different prognostic indices. Our results suggest that the low-order conditional independence method can handle a large number of genes with a small sample size while giving the lowest mean squared error. In addition, the proposed method performs significantly better than the other methods, including full-order conditional independence and the St. Gallen consensus criteria. The proposed method achieved an area under the ROC curve of 0.79203, whereas full-order conditional independence and the St. Gallen consensus criteria obtained 0.76438 and 0.73810, respectively. Furthermore, our empirical evaluation using the low-order conditional independence method revealed a promising relationship between six gene regulators and two regulated genes, which will be further investigated as potential prognostic markers for breast cancer metastasis.
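To illustrate the general idea of low-order conditional independence, the following is a minimal sketch and not the procedure used in this study. It assumes an expression matrix with samples in rows and genes in columns, tests only zero-order (marginal) and first-order (one conditioning gene) dependence, and uses a fixed cutoff on the absolute correlation in place of a formal statistical test. The function names and the `threshold` parameter are hypothetical.

```python
import numpy as np


def first_order_partial_corr(r, i, j, k):
    """Partial correlation of genes i and j given gene k.

    r is a gene-by-gene Pearson correlation matrix.
    """
    num = r[i, j] - r[i, k] * r[j, k]
    den = np.sqrt((1.0 - r[i, k] ** 2) * (1.0 - r[j, k] ** 2))
    return num / den if den > 0 else 0.0


def low_order_ci_graph(x, threshold=0.3):
    """Build an undirected graph from zero- and first-order dependence.

    x: array of shape (n_samples, n_genes).
    threshold: illustrative cutoff on |correlation| (an assumption,
        standing in for a proper significance test).
    Returns a boolean adjacency matrix of shape (n_genes, n_genes).
    """
    r = np.corrcoef(x, rowvar=False)
    p = r.shape[0]
    adj = np.zeros((p, p), dtype=bool)
    for i in range(p):
        for j in range(i + 1, p):
            # Zero-order check: drop pairs with weak marginal correlation.
            if abs(r[i, j]) < threshold:
                continue
            # First-order check: the pair must stay dependent given
            # every other single gene k.
            keep = all(
                abs(first_order_partial_corr(r, i, j, k)) >= threshold
                for k in range(p)
                if k not in (i, j)
            )
            adj[i, j] = adj[j, i] = keep
    return adj
```

Because each test conditions on at most one gene, every estimate depends only on a 3 x 3 block of the correlation matrix, which is why this style of test remains usable when genes far outnumber samples. A full-order test would instead condition on all remaining genes at once, which requires more samples than genes to estimate reliably.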
