DisSetSim: an online system for calculating similarity between disease sets

Background: Functional similarity between molecules results in similar phenotypes, such as diseases. Comparing the diseases that molecules induce is therefore an effective way to reveal molecular function. However, the lack of a tool for obtaining the similarity score of pair-wise disease sets (SSDS) limits this type of application.

Results: Here we introduce DisSetSim, an online system that addresses this problem. Five state-of-the-art methods (Resnik’s, Lin’s, Wang’s, PSB, and SemFunSim) were first implemented to measure the similarity score of pair-wise diseases (SSD). The “pair-wise-best pairs-average” (PWBPA) method was then implemented to calculate the SSDS from the SSD values. The system was applied to calculate the functional similarity of miRNAs based on their induced disease sets, and the results were further used to predict potential disease-miRNA relationships.

Conclusions: The high area under the receiver operating characteristic curve (AUC = 0.9296), obtained by leave-one-out cross-validation, shows that the PWBPA method achieves a high true positive rate and a low false positive rate. The system can be accessed at http://www.bio-annotation.cn:8080/DisSetSim/.
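To make the step from SSD to SSDS concrete, the following is a minimal sketch of a set-level score built from pair-wise scores. The abstract does not give the PWBPA formula, so the sketch assumes the common best-match-average form: each disease is paired with its most similar disease in the other set, and the best scores from both directions are averaged. The function names and the toy SSD values are hypothetical and are not taken from DisSetSim.

```python
from typing import Callable, Dict, FrozenSet, Iterable


def set_similarity(
    set_a: Iterable[str],
    set_b: Iterable[str],
    ssd: Callable[[str, str], float],
) -> float:
    """Best-match-average similarity between two disease sets.

    Assumed form (not confirmed by the source): for every disease, take
    its highest pair-wise score against the other set, then average
    these best scores over all diseases in both sets.
    """
    a, b = list(set_a), list(set_b)
    if not a or not b:
        return 0.0  # assumption: an empty set is similar to nothing
    best_for_a = [max(ssd(x, y) for y in b) for x in a]
    best_for_b = [max(ssd(x, y) for x in a) for y in b]
    return (sum(best_for_a) + sum(best_for_b)) / (len(a) + len(b))


# Hypothetical pair-wise scores standing in for any SSD method
# (Resnik, Lin, Wang, PSB, or SemFunSim).
TOY_SSD: Dict[FrozenSet[str], float] = {
    frozenset({"d1", "d2"}): 0.5,
    frozenset({"d1", "d3"}): 0.2,
    frozenset({"d2", "d3"}): 0.8,
}


def toy_ssd(x: str, y: str) -> float:
    """Symmetric lookup; identical diseases score 1.0, unknown pairs 0.0."""
    if x == y:
        return 1.0
    return TOY_SSD.get(frozenset({x, y}), 0.0)


if __name__ == "__main__":
    score = set_similarity({"d1", "d2"}, {"d2", "d3"}, toy_ssd)
    print(round(score, 3))
```

Passing the pair-wise function as an argument keeps the set-level step independent of which SSD method produced the scores, which mirrors the two-stage description in the abstract.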
