Rough Hypercuboid and Modified Kulczynski Coefficient for Disease Gene Identification

One of the most important objectives of human genetics research is the discovery of genes associated with a disease. To this end, a new gene selection algorithm is presented that judiciously integrates information from gene expression profiles and protein-protein interaction networks. The rough hypercuboid approach is used to identify differentially expressed genes from microarray data, while a new similarity measure is proposed to exploit the protein-protein interaction network and thereby determine the pairwise functional similarity of proteins. The proposed algorithm uses the joint maximization of relevance and functional similarity as its objective function to identify a subset of genes, which it predicts to be disease genes. The performance of the proposed algorithm is compared with that of other related methods on several cancer-associated data sets.
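The selection idea described above can be illustrated with a minimal sketch. The abstract does not specify the modification made to the Kulczynski coefficient, nor how relevance is computed by the rough hypercuboid approach, so the code below makes several assumptions: it uses the standard (second) Kulczynski coefficient on the sets of interaction partners of two proteins, it takes relevance scores as precomputed inputs, and it combines the two criteria with a simple greedy search. The names `kulczynski`, `greedy_select`, and the `weight` parameter are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Set


def kulczynski(a: Set[str], b: Set[str]) -> float:
    """Standard (second) Kulczynski coefficient of two sets.

    Assumption: this is the unmodified coefficient; the paper's
    modified version is not described in the abstract.
    """
    if not a or not b:
        return 0.0
    shared = len(a & b)
    return 0.5 * (shared / len(a) + shared / len(b))


def greedy_select(
    relevance: Dict[str, float],
    neighbors: Dict[str, Set[str]],
    k: int,
    weight: float = 0.5,
) -> List[str]:
    """Greedily pick up to k genes with high relevance and high average
    network similarity to the genes already picked.

    Assumptions: relevance scores are precomputed (for example, from
    expression data), and `weight` balances the two criteria.
    """
    if k <= 0 or not relevance:
        return []
    candidates = set(relevance)
    # Start from the most relevant gene; sorting makes ties deterministic.
    first = max(sorted(candidates), key=lambda g: relevance[g])
    selected = [first]
    candidates.remove(first)
    while candidates and len(selected) < k:

        def score(gene: str) -> float:
            # Average similarity between this gene and the selected genes.
            sim = sum(
                kulczynski(neighbors.get(gene, set()), neighbors.get(s, set()))
                for s in selected
            ) / len(selected)
            return weight * relevance[gene] + (1.0 - weight) * sim

        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy data: hypothetical gene and protein identifiers.
toy_relevance = {"G1": 0.9, "G2": 0.8, "G3": 0.3}
toy_neighbors = {"G1": {"P1", "P2"}, "G2": {"P9"}, "G3": {"P1", "P2"}}
toy_selection = greedy_select(toy_relevance, toy_neighbors, k=2)
```

On this toy input, `toy_selection` is `["G1", "G3"]`: G3 has lower relevance than G2 but shares all of its interaction partners with G1, so the combined score favors it. This shows why integrating network similarity can change which genes are chosen compared with ranking by expression-based relevance alone.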
