DeCoaD: determining correlations among diseases using protein interaction networks

Background: Disease–disease similarities can be investigated from multiple perspectives. Identifying similar diseases based on the underlying biomolecular interactions can be especially useful, because it may shed light on the common causes of the diseases and therefore may provide clues for possible treatments. Here we introduce DeCoaD, a web-based program that uses a novel method to assign pair-wise similarity scores, called correlations, to genetic diseases.

Findings: DeCoaD uses a random walk to model the flow of information in a network within which nodes are either diseases or proteins and links signify either protein–protein interactions or disease–protein associations. For each protein node, the total number of visits by the random walker is called the weight of that node. Using a disease as both the starting and the terminating points of the random walks, a corresponding vector, whose elements are the weights associated with the proteins, can be constructed. The similarity between two diseases is defined as the cosine of the angle between their associated vectors. For a user-specified disease, DeCoaD outputs a list of similar diseases (with their corresponding correlations), and a graphical representation of the disease families that they belong to. Based on a probabilistic clustering algorithm, DeCoaD also outputs the clusters that the disease of interest is a member of, and the corresponding probabilities. The program also provides an interface to run enrichment analysis for the given disease or for any of the clusters that contains it.

Conclusions: DeCoaD uses a novel algorithm to suggest non-trivial similarities between diseases with known gene associations, and also clusters the diseases based on their similarity scores. DeCoaD is available at http://www.ncbi.nlm.nih.gov/CBBresearch/Yu/mn/DeCoaD/.
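The weight vectors and cosine correlation described in the Findings can be illustrated with a small sketch. This is not DeCoaD's implementation; it is a simplified model under stated assumptions: the walker leaves the disease node to a uniformly chosen associated protein, moves uniformly among neighbors, is absorbed on returning to the disease node, and survives each step with a hypothetical `damping` probability (added here so the linear system is always solvable). The toy network, the function names `visit_weights` and `correlation`, and all parameter values are illustrative only.

```python
import numpy as np

def visit_weights(ppi, disease_proteins, damping=0.85):
    """Expected number of visits to each protein for a walk that starts
    and ends at one disease node (simplified, assumed model)."""
    n = ppi.shape[0]
    member = np.zeros(n)
    member[list(disease_proteins)] = 1.0      # proteins linked to the disease
    degree = ppi.sum(axis=1) + member         # PPI links plus the disease link
    safe_degree = np.where(degree > 0, degree, 1.0)  # guard isolated proteins
    # Protein-to-protein transition matrix; the missing probability mass
    # is absorption at the disease node or loss through damping.
    Q = damping * ppi / safe_degree[:, None]
    start = member / member.sum()             # first step out of the disease
    # Expected visits w satisfy w^T = start^T (I - Q)^{-1}.
    return np.linalg.solve((np.eye(n) - Q).T, start)

def correlation(u, v):
    """Cosine of the angle between two weight vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy interactome: five proteins in a chain 0-1-2-3-4 (illustrative only).
ppi = np.zeros((5, 5))
for i in range(4):
    ppi[i, i + 1] = ppi[i + 1, i] = 1.0

w_a = visit_weights(ppi, {0, 1})   # hypothetical disease A
w_b = visit_weights(ppi, {1, 2})   # hypothetical disease B, overlapping A
w_c = visit_weights(ppi, {4})      # hypothetical disease C, far from A

print("corr(A, B) =", correlation(w_a, w_b))
print("corr(A, C) =", correlation(w_a, w_c))
```

In this toy setting, diseases whose associated proteins lie close together in the network receive a higher correlation than diseases whose proteins are far apart, which is the qualitative behavior the abstract describes.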
