Core module biomarker identification with network exploration for breast cancer metastasis

Background
In a complex disease, the expression of many genes can be significantly altered, producing a differentially expressed "disease module". Some of these genes correspond directly to the disease phenotype (the "driver" genes), while others are their first-degree neighbours in gene interaction space. The remaining genes are more distant "passenger" genes, which are often not directly related to the original cause of the disease. For prognostic and diagnostic purposes, it is crucial to separate the "driver" genes and their first-degree neighbours (the "core module") from the general "disease module".

Results
We have developed COMBINER: COre Module Biomarker Identification with Network ExploRation. COMBINER is a novel pathway-based approach for selecting highly reproducible discriminative biomarkers. We applied COMBINER to three benchmark breast cancer datasets to identify prognostic biomarkers. COMBINER-derived biomarkers exhibited 10-fold higher reproducibility than other methods, with up to 30-fold greater enrichment for known cancer-related genes and 4-fold enrichment for known breast cancer susceptibility genes. More than 50% of the resulting biomarkers were cancer-specific, and more than 40% were breast cancer-specific. The identified modules were overlaid onto a map of intracellular pathways that comprehensively highlighted the hallmarks of cancer. Furthermore, we constructed a global regulatory network intertwining several functional clusters and uncovered 13 confident "driver" genes of breast cancer metastasis.

Conclusions
COMBINER can efficiently and robustly identify disease core module genes and construct their associated regulatory network. The same approach is potentially applicable to the characterization of any disease that can be probed with microarrays.
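As a rough illustration of the kind of arithmetic behind the fold-enrichment figures quoted in the Results, the sketch below computes an observed-to-expected overlap between a selected biomarker set and a reference gene list, together with a one-sided hypergeometric p-value. The abstract does not state which enrichment statistic COMBINER uses, so this is an assumption and not the authors' procedure; the function names, gene identifiers, and set sizes are hypothetical toy values.

```python
from math import comb


def fold_enrichment(selected, known, universe_size):
    """Observed overlap divided by the overlap expected by chance.

    Assumes `selected` and `known` are both drawn from the same gene
    universe of `universe_size` genes (an assumption of this sketch).
    """
    selected, known = set(selected), set(known)
    observed = len(selected & known)
    expected = len(selected) * len(known) / universe_size
    if expected == 0:
        raise ValueError("expected overlap is zero; enrichment is undefined")
    return observed / expected


def hypergeom_pvalue(selected, known, universe_size):
    """One-sided P(overlap >= observed) when sampling without replacement."""
    selected, known = set(selected), set(known)
    k = len(selected & known)
    n, K, N = len(selected), len(known), universe_size
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


# Toy data (hypothetical): 1000-gene universe, 50 reference genes,
# 10 selected biomarkers of which 5 are in the reference list.
known_genes = {f"G{i}" for i in range(50)}
biomarkers = {f"G{i}" for i in range(5)} | {f"G{i}" for i in range(100, 105)}

fe = fold_enrichment(biomarkers, known_genes, universe_size=1000)
p = hypergeom_pvalue(biomarkers, known_genes, universe_size=1000)
print(f"fold enrichment = {fe:.1f}, p = {p:.2e}")
```

In this toy case the expected overlap is 10 × 50 / 1000 = 0.5 genes, so observing 5 gives a 10-fold enrichment. Comparing two methods' fold values on the same reference list yields the kind of relative statement made in the Results.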
