RelSim: An integrated method to identify disease genes using gene expression profiles and PPIN based similarity measure

Abstract Selecting disease genes is one of the important problems in functional genomics. In this regard, the paper presents a new gene selection algorithm, termed RelSim, to identify disease genes. It judiciously integrates the information of gene expression profiles and protein-protein interaction networks. A new similarity measure, based on the information of protein-protein interaction networks, is introduced as an efficient way to compute the functional similarity between two genes. The proposed algorithm selects a set of genes as disease genes, considering both microarray and protein-protein interaction data, by maximizing the relevance and functional similarity of the selected genes. While gene expression profiles are used to identify differentially expressed genes, the protein-protein interaction networks are used to compute the functional similarity among genes. The performance of the proposed algorithm, along with a comparison with other related methods, is demonstrated on several colon cancer data sets.
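
The selection criterion described above can be illustrated with a minimal sketch. The abstract does not give the actual RelSim relevance or similarity formulas, so everything here is an assumption made for illustration only: `relevance` is a simple class-mean difference scaled by the overall spread, `neighbor_similarity` is a Jaccard overlap of PPI neighbourhoods, and `select_genes` is a hypothetical greedy search with an assumed `weight` parameter balancing the two terms.

```python
from statistics import mean, pstdev


def relevance(values, labels):
    """Assumed relevance score (not the paper's): absolute difference of the
    two class means, divided by the overall standard deviation."""
    a = [v for v, y in zip(values, labels) if y == 0]
    b = [v for v, y in zip(values, labels) if y == 1]
    spread = pstdev(values) or 1.0  # avoid division by zero for flat genes
    return abs(mean(a) - mean(b)) / spread


def neighbor_similarity(g, h, ppi):
    """Assumed functional similarity (not the paper's): Jaccard overlap of the
    closed neighbourhoods of g and h in the PPI network."""
    ng = ppi.get(g, set()) | {g}
    nh = ppi.get(h, set()) | {h}
    return len(ng & nh) / len(ng | nh)


def select_genes(expr, labels, ppi, k, weight=0.5):
    """Hypothetical greedy selection: start from the most relevant gene, then
    repeatedly add the gene with the best weighted sum of relevance and mean
    similarity to the genes already selected."""
    rel = {g: relevance(v, labels) for g, v in expr.items()}
    selected = [max(rel, key=rel.get)]
    while len(selected) < min(k, len(expr)):
        def score(g):
            sim = mean(neighbor_similarity(g, s, ppi) for s in selected)
            return weight * rel[g] + (1 - weight) * sim
        candidates = [g for g in expr if g not in selected]
        selected.append(max(candidates, key=score))
    return selected


# Toy data (invented): expression values per gene, one label per sample.
expr = {
    "A": [1, 1, 5, 5],
    "B": [1, 2, 4, 5],
    "C": [3, 3, 3, 3],
    "D": [2, 3, 3, 4],
}
labels = [0, 0, 1, 1]
ppi = {"A": {"B"}, "B": {"A"}, "C": {"D"}, "D": {"C"}}
print(select_genes(expr, labels, ppi, k=2))
```

In this sketch a gene is favoured both for separating the two sample classes and for sitting close, in the interaction network, to genes already chosen. The greedy search and the linear weighting are illustrative design choices, not something the abstract specifies.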

[1]  J. Penninger,et al.  From T‐cell activation signals to signaling control of anti‐cancer immunity , 2007, Immunological reviews.

[2]  Petter Holme,et al.  Ranking Candidate Disease Genes from Gene Expression and Protein Interaction: A Katz-Centrality Based Approach , 2011, PloS one.

[3]  Nikos Vlassis,et al.  GenePEN: analysis of network activity alterations in complex diseases via the pairwise elastic net , 2015, Statistical applications in genetics and molecular biology.

[4]  Chris H. Q. Ding,et al.  Minimum Redundancy Feature Selection from Microarray Gene Expression Data , 2005, J. Bioinform. Comput. Biol..

[5]  Chris H. Q. Ding,et al.  Minimum redundancy feature selection from microarray gene expression data , 2003, Computational Systems Bioinformatics. CSB2003. Proceedings of the 2003 IEEE Bioinformatics Conference. CSB2003.

[6]  Saralees Nadarajah,et al.  Statistical methods on detecting differentially expressed genes for RNA-seq data , 2011, BMC Systems Biology.

[7]  David Correa Martins,et al.  Identifying dense subgraphs in protein–protein interaction network for gene selection from microarray data , 2015, Network Modeling Analysis in Health Informatics and Bioinformatics.

[8]  Mario Malerba,et al.  Lipid Droplets: A New Player in Colorectal Cancer Stem Cells Unveiled by Spectroscopic Imaging , 2014, Stem cells.

[9]  Daniele Santoni,et al.  An integrated approach (CLuster Analysis Integration Method) to combine expression data and protein-protein interaction networks in agrigenomics: application on Arabidopsis thaliana. , 2014, Omics : a journal of integrative biology.

[10]  K. Chou,et al.  Identification of Colorectal Cancer Related Genes with mRMR and Shortest Path in Protein-Protein Interaction Network , 2012, PloS one.

[11]  Jon Clardy,et al.  FOXO3a mediates the cytotoxic effects of cisplatin in colon cancer cells , 2008, Molecular Cancer Therapeutics.

[12]  Shuji Ogino,et al.  Toll-like receptor signaling in colorectal cancer: carcinogenesis to cancer therapy. , 2014, World journal of gastroenterology.

[13]  T. Hirano,et al.  Roles of STAT3 in mediating the cell growth, differentiation and survival signals relayed through the IL-6 family of cytokine receptors , 2000, Oncogene.

[14]  Kuo-Chen Chou,et al.  Classification and Analysis of Regulatory Pathways Using Graph Property, Biochemical and Physicochemical Property, and Functional Property , 2011, PloS one.

[15]  SantoniDaniele,et al.  An Integrated Approach (CLuster Analysis Integration Method) to Combine Expression Data and Protein–Protein Interaction Networks in Agrigenomics: Application on Arabidopsis thaliana , 2014 .

[16]  Jochen Hampe,et al.  Functional TLR5 genetic variants affect human colorectal cancer survival. , 2013, Cancer research.

[17]  A. Barabasi,et al.  The human disease network , 2007, Proceedings of the National Academy of Sciences.

[18]  Li Liang,et al.  FOXC2 promotes colorectal cancer proliferation through inhibition of FOXO3a and activation of MAPK and AKT signaling pathways. , 2014, Cancer letters.

[19]  M. DePamphilis,et al.  HUMAN DISEASE , 1957, The Ulster Medical Journal.

[20]  Hui Xiong,et al.  β-Catenin activates the growth factor endothelin-1 in colon cancer cells , 2005, Oncogene.

[21]  B. Snel,et al.  Predicting disease genes using protein–protein interactions , 2006, Journal of Medical Genetics.

[22]  Pradipta Maji,et al.  Rough set based maximum relevance-maximum significance criterion and Gene selection from microarray data , 2011, Int. J. Approx. Reason..

[23]  Wei Zheng,et al.  dmGWAS: dense module searching for genome-wide association studies in protein-protein interaction networks , 2011, Bioinform..

[24]  Ying-Xuan Chen,et al.  Inhibition of JAK1, 2/STAT3 signaling induces apoptosis, cell cycle arrest, and reduces tumor cell invasion in colorectal cancer cells. , 2008, Neoplasia.

[25]  Jing Chen,et al.  Disease candidate gene identification and prioritization using protein interaction networks , 2009, BMC Bioinformatics.

[26]  F. Du,et al.  The role of hypoxia-inducible factor-2 in digestive system cancers , 2015, Cell Death and Disease.

[27]  Saeid Nahavandi,et al.  Hidden Markov models for cancer classification using gene expression profiles , 2015, Inf. Sci..

[28]  Yiannis Kourmpetis,et al.  Bayesian Markov Random Field Analysis for Protein Function Prediction Based on Network Data , 2010, PloS one.

[29]  Sanghyun Park,et al.  Direct integration of microarrays for selecting informative genes and phenotype classification , 2008, Inf. Sci..

[30]  Michal A. Kurowski,et al.  Transcriptome Profile of Human Colorectal Adenomas , 2007, Molecular Cancer Research.

[31]  Pradipta Maji,et al.  Scalable Pattern Recognition Algorithms: Applications in Computational Biology and Bioinformatics , 2014 .

[32]  Chao Wu,et al.  Integrating gene expression and protein-protein interaction network to prioritize cancer-associated genes , 2012, BMC Bioinformatics.

[33]  Pradipta Maji,et al.  Gene expression and protein–protein interaction data for identification of colon cancer related genes using f-information measures , 2015, Natural Computing.

[34]  P. Meltzer Spotting the target: microarrays for disease gene discovery. , 2001, Current opinion in genetics & development.

[35]  José Salvador Sánchez,et al.  Mapping microarray gene expression data into dissimilarity spaces for tumor classification , 2015, Inf. Sci..

[36]  Jinyan Li,et al.  Disease gene identification by random walk on multigraphs merging heterogeneous genomic and phenotype data , 2012, BMC Genomics.

[37]  Parichehr Hassanzadeh,et al.  Colorectal cancer and NF-κB signaling pathway , 2011, Gastroenterology and hepatology from bed to bench.

[38]  Peng Gang Sun,et al.  The human Drug-Disease-Gene Network , 2015, Inf. Sci..

[39]  Hai-sheng Zhang,et al.  PAX2 Protein Induces Expression of Cyclin D1 through Activating AP-1 Protein and Promotes Proliferation of Colon Cancer Cells* , 2012, The Journal of Biological Chemistry.

[40]  Pornpimol Charoentong,et al.  ClueGO: a Cytoscape plug-in to decipher functionally grouped gene ontology and pathway annotation networks , 2009, Bioinform..

[41]  Roded Sharan,et al.  A Network-Based Method for Predicting Disease-Causing Genes , 2009, J. Comput. Biol..

[42]  M. Miyasaka,et al.  Chemokines in tumor progression and metastasis , 2005, Cancer science.

[43]  Maozu Guo,et al.  Mining disease genes using integrated protein–protein interaction and gene–gene co-regulation information , 2015, FEBS open bio.

[44]  Damian Szklarczyk,et al.  The STRING database in 2011: functional interaction networks of proteins, globally integrated and scored , 2010, Nucleic Acids Res..

[45]  P. Shannon,et al.  Cytoscape: a software environment for integrated models of biomolecular interaction networks. , 2003, Genome research.

[46]  Carl Kingsford,et al.  The power of protein interaction networks for associating genes with diseases , 2010, Bioinform..

[47]  Stanley Letovsky,et al.  Predicting protein function from protein/protein interaction data: a probabilistic approach , 2003, ISMB.

[48]  Jake Yue Chen,et al.  Reordering based integrative expression profiling for microarray classification , 2012, BMC Bioinformatics.

[49]  Antonio Reverter,et al.  A Boolean-based systems biology approach to predict novel genes associated with cancer: Application to colorectal cancer , 2011, BMC Systems Biology.

[50]  Francesco Archetti,et al.  A p-Median approach for predicting drug response in tumour cells , 2014, BMC Bioinformatics.

[51]  Yosuke Osawa,et al.  Liver acid sphingomyelinase inhibits growth of metastatic colon cancer. , 2013, The Journal of clinical investigation.

[52]  E. Dermitzakis From gene expression to disease risk , 2008, Nature Genetics.

[53]  Osamu Yoshie,et al.  Chemokine CXCL16 suppresses liver metastasis of colorectal cancer via augmentation of tumor-infiltrating natural killer T cells in a murine model. , 2013, Oncology reports.

[54]  Xiaojing Quan,et al.  MicroRNA-126 functions as a tumor suppressor in colorectal cancer cells by targeting CXCR4 via the AKT and ERK1/2 signaling pathways. , 2014, International journal of oncology.

[55]  Hui Xiong,et al.  beta-Catenin activates the growth factor endothelin-1 in colon cancer cells. , 2005, Oncogene.

[56]  A. Rizzo,et al.  2-Methoxy-5-Amino-N-Hydroxybenzamide Sensitizes Colon Cancer Cells to TRAIL-Induced Apoptosis by Regulating Death Receptor 5 and Survivin Expression , 2011, Molecular Cancer Therapeutics.