Pathway Enrichment Analysis with Networks

Detecting associations between an input gene set and annotated gene sets (e.g., pathways) is an important problem in modern molecular biology. In this paper, we propose two algorithms, termed NetPEA and NetPEA’, for network-based pathway enrichment analysis. Our algorithms consider not only shared genes but also gene–gene interactions. Both algorithms use a protein–protein interaction network and a random walk with restart procedure to identify hidden relationships between an input gene set and pathways. They differ in the randomization strategy used to evaluate statistical significance and, as a result, emphasize different pathway properties. Compared to an over-representation-based method, our algorithms identify more statistically significant pathways. Compared to an existing network-based algorithm, EnrichNet, our algorithms achieve both higher sensitivity in revealing the true causal pathways and higher specificity. A literature review of selected results indicates that some of the novel pathways reported by our algorithms are biologically relevant and important. Although the evaluations are performed only with KEGG pathways, we believe the algorithms can be valuable for general functional discovery from high-throughput experiments.
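To make the random walk with restart step concrete, the following is a minimal sketch on a toy network. It is not the NetPEA implementation: the function name `random_walk_with_restart`, the restart probability of 0.5, the five-node path graph, and the mean-score pathway summary are all assumptions chosen for illustration. The abstract does not specify how scores are aggregated per pathway or how the two randomization strategies work, so neither is reproduced here.

```python
import numpy as np


def random_walk_with_restart(adj, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Return steady-state visiting probabilities for a walk restarting at `seeds`.

    adj     : symmetric adjacency matrix of the interaction network (hypothetical input)
    seeds   : indices of the input gene set
    restart : probability of jumping back to the seed set at each step (assumed value)
    """
    adj = np.asarray(adj, dtype=float)
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0          # avoid division by zero for isolated nodes
    transition = adj / col_sums            # column-normalized transition matrix

    p0 = np.zeros(adj.shape[0])
    p0[list(seeds)] = 1.0 / len(seeds)     # uniform start over the seed genes

    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1.0 - restart) * (transition @ p) + restart * p0
        if np.abs(p_next - p).sum() < tol:  # L1 convergence check
            return p_next
        p = p_next
    return p


# Toy network: a path graph 0 - 1 - 2 - 3 - 4, with gene 0 as the input gene set.
adj = np.array([
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
])
scores = random_walk_with_restart(adj, seeds=[0])

# Hypothetical pathway made of genes 1 and 2; the mean score is one simple
# way to summarize proximity to the input set (an assumption, not the paper's statistic).
pathway = [1, 2]
pathway_score = scores[pathway].mean()
print(scores, pathway_score)
```

In this sketch, genes closer to the seed receive higher scores even when they are not in the input set, which is how a network-based method can pick up pathways that share few or no genes with the input. Assessing significance would additionally require comparing `pathway_score` against a null distribution obtained by randomization.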
