Topologically inferring risk-active pathways toward precise cancer classification by directed random walk

MOTIVATION: The accurate prediction of disease status is a central challenge in clinical cancer research. Microarray-based gene biomarkers have been identified that predict outcome and outperform traditional clinical parameters. However, the robustness of individual gene biomarkers is questioned because of their poor reproducibility across different patient cohorts. Substantial progress in treatment requires advances in methods to identify robust biomarkers. Several methods incorporating pathway information have been proposed to identify robust pathway markers and build classifiers at the level of functional categories rather than individual genes. However, current methods treat pathways as simple gene sets and ignore pathway topological information, which is essential to infer a more robust pathway activity.

RESULTS: Here, we propose a directed random walk (DRW)-based method to infer pathway activity. DRW evaluates the topological importance of each gene by capturing the structural information embedded in the directed pathway network. Weighting genes by their topological importance greatly improved the reproducibility of pathway activities. Experiments on 18 cancer datasets showed that the proposed method yielded more accurate and robust overall performance than several existing gene-based and pathway-based classification methods. The resulting risk-active pathways are more reliable in guiding therapeutic selection and the development of pathway-specific therapeutic strategies.

AVAILABILITY: DRW is freely available at http://210.46.85.180:8080/DRWPClass/
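The idea of weighting genes by topological importance can be illustrated with a minimal sketch. The abstract does not give the update rule, so everything here is an assumption made for illustration: a random walk with restart on the directed gene graph, a restart probability of 0.7, uniform initial weights, walking along (not against) the edge direction, and a pathway activity defined as a weight-scaled sum of per-gene scores. The function names `directed_random_walk` and `pathway_activity` are hypothetical and are not taken from the DRW software.

```python
import math


def directed_random_walk(edges, genes, w0, restart=0.7, tol=1e-10, max_iter=1000):
    """Random walk with restart on a directed graph (illustrative assumption).

    edges: list of (source, target) gene pairs
    genes: list of gene names
    w0:    initial non-negative weight per gene, same order as `genes`
    Returns a dict mapping each gene to its stationary weight (weights sum to 1).
    """
    index = {g: i for i, g in enumerate(genes)}
    n = len(genes)
    out = [[] for _ in range(n)]          # out-neighbours of each node
    for src, dst in edges:
        out[index[src]].append(index[dst])

    total = sum(w0)
    w_init = [x / total for x in w0]      # restart distribution
    w = list(w_init)
    for _ in range(max_iter):
        nxt = [restart * x for x in w_init]
        for i in range(n):
            if out[i]:
                # split the walking mass evenly among out-neighbours
                share = (1 - restart) * w[i] / len(out[i])
                for j in out[i]:
                    nxt[j] += share
            else:
                # node without outgoing edges: send its mass back to the restart distribution
                for j in range(n):
                    nxt[j] += (1 - restart) * w[i] * w_init[j]
        delta = sum(abs(a - b) for a, b in zip(nxt, w))
        w = nxt
        if delta < tol:
            break
    return dict(zip(genes, w))


def pathway_activity(weights, scores, pathway_genes):
    """Weight-scaled sum of per-gene scores (hypothetical definition)."""
    num = sum(weights[g] * scores[g] for g in pathway_genes)
    den = math.sqrt(sum(weights[g] ** 2 for g in pathway_genes))
    return num / den if den else 0.0


# Toy directed pathway: A regulates B and C, B regulates C.
toy_genes = ["A", "B", "C"]
toy_edges = [("A", "B"), ("A", "C"), ("B", "C")]
weights = directed_random_walk(toy_edges, toy_genes, [1.0, 1.0, 1.0])

# Made-up standardized expression scores for one sample.
sample_scores = {"A": 0.5, "B": -1.0, "C": 2.0}
activity = pathway_activity(weights, sample_scores, toy_genes)
```

In this toy graph the walk follows the edges, so the downstream gene C accumulates the largest weight. Reversing the edge list would instead favor upstream regulators. Which convention suits a given analysis is a design choice that this sketch does not settle.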
