Network-based classification of recurrent endometrial cancers using high-throughput DNA methylation data

DNA methylation, a well-studied mechanism of epigenetic regulation, plays important roles in cancer. Increased levels of global DNA methylation are observed in primary solid tumors, including endometrial carcinomas, and are generally associated with silencing of tumor suppressor genes. The role of DNA methylation in cancer recurrence after therapeutic intervention is not clear. Here, we developed a novel computational method to analyze whole-genome DNA methylation data for endometrial tumors within the context of a human protein-protein interaction (PPI) network, in order to identify subnetworks as potential epigenetic biomarkers for predicting tumor recurrence. Our method consists of the following steps. First, differentially methylated (DM) genes between recurrent and non-recurrent tumors are identified and mapped onto a human PPI network. Then, a PPI subnetwork consisting of the DM genes and the genes that are topologically important for connecting the DMs on the PPI network, termed epigenetic connectors (ECs), is extracted using a Steiner-tree-based algorithm. Finally, a random-walk-based machine learning method is used to propagate the DNA methylation scores from the DMs to the ECs, which enables the ECs to be used as features in a support vector machine classifier for predicting recurrence. Remarkably, we found that while the DMs are not enriched in any cancer-related pathways, the ECs are enriched in many well-known tumorigenesis and metastasis pathways and include known epigenetic regulators. Moreover, combining the DMs and ECs significantly improves the prediction accuracy of cancer recurrence and outperforms several alternative methods. Therefore, the network-based method is effective in identifying gene subnetworks that are crucial both for the understanding and prediction of tumor recurrence.
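The score-propagation step can be illustrated with a minimal sketch. The abstract does not specify the exact random-walk formulation, so the code below assumes a standard random walk with restart on an undirected graph; the function name, the restart probability of 0.5, the toy three-gene network, and the positive seed scores are all hypothetical choices made for illustration, not the authors' implementation.

```python
def random_walk_with_restart(adj, seed_scores, restart=0.5, tol=1e-10, max_iter=1000):
    """Propagate seed scores over an undirected graph (illustrative sketch).

    adj         -- dict mapping each node to the set of its neighbours
                   (assumed: every node has at least one neighbour)
    seed_scores -- dict mapping seed nodes (here, DM genes) to positive scores
    restart     -- assumed probability of jumping back to the seeds at each step
    """
    nodes = sorted(adj)
    total = sum(seed_scores.values())
    # Restart distribution: seed scores normalised to sum to 1, zero elsewhere.
    p0 = {n: seed_scores.get(n, 0.0) / total for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {}
        for n in nodes:
            # Each neighbour m splits its current mass evenly over its edges.
            inflow = sum(p[m] / len(adj[m]) for m in adj[n])
            nxt[n] = (1.0 - restart) * inflow + restart * p0[n]
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical toy network: two DM genes joined through one connector (EC).
toy_ppi = {
    "DM1": {"EC"},
    "EC": {"DM1", "DM2"},
    "DM2": {"EC"},
}
scores = random_walk_with_restart(toy_ppi, {"DM1": 1.0, "DM2": 1.0})
```

In this toy case the connector starts with no score of its own and acquires one purely through its position between the two DM genes, which is the property that would let an EC serve as a classifier feature alongside the DMs.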
