Discovering disease-disease associations by fusing systems-level molecular data

The advent of genome-scale genetic and genomic studies allows new insight into disease classification. Recently, the field has shifted from linking diseases simply on the basis of their shared genes towards systems-level integration of molecular data. Here, we aim to find relationships between diseases based on evidence from fusing all available molecular interaction and ontology data. We propose a multi-level hierarchy of disease classes that significantly overlaps with existing disease classification. In it, we find 14 disease-disease associations currently not present in Disease Ontology and provide evidence for their relationships through comorbidity data and literature curation. Interestingly, even though the number of known human genetic interactions is currently very small, we find that they are the most important predictor of a link between diseases. Finally, we show that omitting any one of the included data sources reduces prediction quality, further highlighting the importance of the paradigm shift towards systems-level data fusion.
