A generative Bayesian model to identify cancer driver genes

Cancer is a disease characterized largely by the accumulation of somatic mutations over a patient's lifetime. Distinguishing driver mutations from passenger mutations remains a challenge in modern cancer research. State-of-the-art microarray technologies and clinical studies yield large numbers of candidate genes, and extracting the informative ones from among them is essential. In this project we aim to identify cancer driver genes using somatic mutation data and protein-protein interaction data. We developed a generative mixture model coupled with Bayesian parameter estimation to estimate the background mutation rate and driver probability of each gene, as well as the proportion of drivers among all sequenced genes. We chose suitable prior distributions for modelling both the driver probability and the background mutation rate of each gene. We applied our method to ovarian cancer data and estimated the solution numerically. Upon convergence, the method identified several new candidate cancer driver genes.
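To make the idea of a driver/passenger mixture concrete, the following is a minimal sketch and not the model described above. It assumes, purely for illustration, that per-gene mutation counts are Poisson with one shared background rate and one shared elevated driver rate, a Beta prior on the driver proportion, Gamma priors on the two rates, and MAP estimation by EM. It ignores protein-protein interaction data entirely. The function names, the priors, and the `exposures` input (for example gene length times number of samples) are all hypothetical choices.

```python
import math


def _sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    z = math.exp(t)
    return z / (1.0 + z)


def _driver_posteriors(counts, exposures, pi, lam0, lam1):
    """E-step: posterior probability that each gene is a driver.

    Under the assumed Poisson model the factorial and exposure terms
    cancel in the likelihood ratio, leaving a simple log-odds.
    """
    prior_log_odds = math.log(pi) - math.log1p(-pi)
    log_rate_ratio = math.log(lam1) - math.log(lam0)
    return [
        _sigmoid(prior_log_odds + x * log_rate_ratio - (lam1 - lam0) * e)
        for x, e in zip(counts, exposures)
    ]


def fit_driver_mixture(counts, exposures, n_iter=100,
                       beta_prior=(2.0, 8.0), gamma_prior=(2.0, 1.0)):
    """MAP-EM for a two-component Poisson mixture (illustrative only).

    Returns (pi, lam0, lam1, posteriors), where pi is the estimated
    driver proportion, lam0 the background rate, lam1 the driver rate.
    The prior hyperparameters are assumptions, not values from the text.
    """
    a, b = beta_prior
    shape, rate = gamma_prior
    n = len(counts)
    # Initialise both rates around the pooled (prior-smoothed) rate.
    pooled = (sum(counts) + shape - 1.0) / (sum(exposures) + rate)
    pi, lam0, lam1 = 0.1, 0.5 * pooled, 5.0 * pooled
    for _ in range(n_iter):
        resp = _driver_posteriors(counts, exposures, pi, lam0, lam1)
        # M-step: posterior modes under the Beta and Gamma priors.
        pi = (sum(resp) + a - 1.0) / (n + a + b - 2.0)
        lam1 = ((sum(r * x for r, x in zip(resp, counts)) + shape - 1.0)
                / (sum(r * e for r, e in zip(resp, exposures)) + rate))
        lam0 = ((sum((1.0 - r) * x for r, x in zip(resp, counts)) + shape - 1.0)
                / (sum((1.0 - r) * e for r, e in zip(resp, exposures)) + rate))
        if lam1 < lam0:  # keep "driver" as the higher-rate component
            lam0, lam1, pi = lam1, lam0, 1.0 - pi
    resp = _driver_posteriors(counts, exposures, pi, lam0, lam1)
    return pi, lam0, lam1, resp


# Toy input (made up): 45 genes with 2 mutations, 5 genes with 25.
counts = [2] * 45 + [25] * 5
exposures = [1.0] * 50
pi_hat, lam0_hat, lam1_hat, post = fit_driver_mixture(counts, exposures)
```

On this toy input the five high-count genes receive posterior driver probabilities close to 1 and the rest close to 0. A fully Bayesian treatment would typically sample or integrate over the parameters rather than take posterior modes, and per-gene background rates would need their own hierarchy.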
