A novel approach to significant pathway identification using pathway interaction network from PPI data

Discovering and understanding the genetic markers (e.g., SNPs, genes, pathways) related to a phenotype of interest is one of the fundamental challenges in recent genetic studies. Conventional methods usually address this by detecting significantly differentially expressed genes or SNPs between case and control samples. However, such approaches often produce a large list of candidate markers, only a few of which are truly associated with the given phenotype; that is, their results often include many false positives. As an alternative, several recent studies have attempted to identify significant functional modules (or pathways), each of which contains a set of genes involved in a particular biological function or process. Such pathway markers may be better at uncovering complex disease mechanisms than individual gene markers. This paper investigates a novel approach to significant pathway identification that exploits a pathway interaction network (PIN) derived from protein-protein interaction (PPI) data. Specifically, we first construct a PIN, which captures hidden associations between biological pathways, by exploring PPI data, and then prioritize the pathway nodes of the PIN with the PIN-PageRank algorithm to identify significant pathways. In this procedure, we use differentially expressed gene profiles to initialize the PIN nodes. To evaluate the efficacy and usability of the proposed approach, we performed experiments on identifying breast cancer relevant pathways and compared the results with those of existing approaches such as GSEA and DAVID. Overall, our PIN-PageRank approach outperformed the existing approaches in finding significant pathways.
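The three steps named in the abstract (PIN construction from PPI data, node initialization from differentially expressed genes, and PageRank-style prioritization) can be illustrated with a minimal sketch. The abstract does not give the edge weighting, the initialization formula, or the damping factor, so everything below is an assumption made for illustration: edges are weighted by the number of PPI links crossing two pathways, each pathway is initialized with its fraction of differentially expressed genes, and ranking uses a standard personalized PageRank with damping 0.85. The function names and the toy gene and pathway identifiers are hypothetical.

```python
from collections import defaultdict


def build_pin(pathways, ppi_edges):
    """Link two pathways when a PPI edge joins a gene of one to a gene of the other.

    Assumed weighting: edge weight = number of such cross-pathway PPI edges.
    """
    gene_to_pathways = defaultdict(set)
    for name, genes in pathways.items():
        for gene in genes:
            gene_to_pathways[gene].add(name)

    weights = defaultdict(float)
    for a, b in ppi_edges:
        for pa in gene_to_pathways.get(a, ()):
            for pb in gene_to_pathways.get(b, ()):
                if pa != pb:
                    weights[frozenset((pa, pb))] += 1.0
    return dict(weights)


def initial_scores(pathways, de_genes):
    """Assumed initialization: fraction of a pathway's genes that are
    differentially expressed, normalized to sum to 1."""
    raw = {p: len(genes & de_genes) / len(genes) for p, genes in pathways.items()}
    total = sum(raw.values())
    if total == 0:
        return {p: 1.0 / len(pathways) for p in pathways}
    return {p: v / total for p, v in raw.items()}


def pin_pagerank(pathways, weights, prior, damping=0.85, iters=200, tol=1e-12):
    """Personalized PageRank over the weighted, undirected PIN."""
    nodes = list(pathways)
    neighbours = {p: {} for p in nodes}
    for pair, w in weights.items():
        a, b = tuple(pair)
        neighbours[a][b] = w
        neighbours[b][a] = w
    strength = {p: sum(neighbours[p].values()) for p in nodes}

    rank = dict(prior)
    for _ in range(iters):
        # Score held by isolated pathways is redistributed according to the prior.
        dangling = sum(rank[p] for p in nodes if strength[p] == 0)
        new = {}
        for p in nodes:
            inflow = sum(rank[q] * w / strength[q] for q, w in neighbours[p].items())
            new[p] = (1 - damping) * prior[p] + damping * (inflow + dangling * prior[p])
        delta = sum(abs(new[p] - rank[p]) for p in nodes)
        rank = new
        if delta < tol:
            break
    return rank


# Toy data (hypothetical identifiers, not taken from the paper).
pathways = {"A": {"g1", "g2"}, "B": {"g3", "g4"}, "C": {"g5", "g6"}}
ppi_edges = [("g2", "g3"), ("g4", "g5")]
de_genes = {"g1", "g2", "g3"}

pin = build_pin(pathways, ppi_edges)
prior = initial_scores(pathways, de_genes)
ranks = pin_pagerank(pathways, pin, prior)
for name in sorted(ranks, key=ranks.get, reverse=True):
    print(name, round(ranks[name], 4))
```

In this toy network, pathway C contains no differentially expressed genes, yet it still receives a nonzero score through its PPI-derived link to pathway B. That propagation of evidence across related pathways is the behaviour the network-based ranking is meant to provide; the actual scoring used in the paper may differ.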
