RDAD: A Machine Learning System to Support Phenotype-Based Rare Disease Diagnosis

DNA sequencing has enabled the discovery of the genetic cause of a considerable number of diseases, paving the way for new disease diagnostics. However, because clinical samples and records are scarce, the molecular causes of rare diseases are often hard to identify, which significantly limits the number of rare Mendelian diseases that can be diagnosed through sequencing technologies. Clinical phenotype information therefore becomes a major resource for diagnosing rare diseases. In this article, we adopted both a phenotypic similarity method and a machine learning method to build four diagnostic models to support rare disease diagnosis. All the diagnostic models were validated using real medical records from RAMEDIS. Each model provides a list of the top 10 candidate diseases as its prediction outcome. The results showed that all models had a high diagnostic precision (≥98%), with the highest recall reaching 95%, and that the models based on machine learning methods performed best. To promote effective diagnosis of rare diseases in clinical practice, we developed the phenotype-based Rare Disease Auxiliary Diagnosis system (RDAD), which assists clinicians in diagnosing rare diseases using the above four diagnostic models. The system is freely accessible at http://www.unimd.org/RDAD/.
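As an illustration of the general idea of phenotype-based ranking with a top-10 candidate list, the following is a minimal sketch and not the RDAD implementation. The abstract does not specify the similarity measure, so cosine similarity over binary phenotype sets is an assumption here; the disease names, phenotype terms, and the functions `cosine_similarity`, `rank_diseases`, and `recall_at_k` are all hypothetical.

```python
from math import sqrt

# Hypothetical toy disease-phenotype annotations (not taken from RAMEDIS or RDAD).
DISEASE_PHENOTYPES = {
    "disease_A": {"seizures", "hypotonia", "lactic acidosis"},
    "disease_B": {"hepatomegaly", "hypoglycemia", "lactic acidosis"},
    "disease_C": {"seizures", "microcephaly"},
}


def cosine_similarity(a, b):
    """Cosine similarity between two phenotype sets treated as binary vectors."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def rank_diseases(patient_phenotypes, annotations, top_k=10):
    """Return up to top_k (disease, score) pairs, highest score first."""
    scored = [
        (disease, cosine_similarity(patient_phenotypes, phenotypes))
        for disease, phenotypes in annotations.items()
    ]
    # Sort by descending score; break ties by disease name for determinism.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_k]


def recall_at_k(cases, annotations, top_k=10):
    """Fraction of cases whose true disease appears among the top_k candidates."""
    if not cases:
        return 0.0
    hits = 0
    for phenotypes, true_disease in cases:
        candidates = [d for d, _ in rank_diseases(phenotypes, annotations, top_k)]
        hits += true_disease in candidates
    return hits / len(cases)


if __name__ == "__main__":
    patient = {"seizures", "hypotonia"}
    for disease, score in rank_diseases(patient, DISEASE_PHENOTYPES):
        print(f"{disease}\t{score:.3f}")
```

In this sketch, a case counts as correctly diagnosed when the true disease appears anywhere in the candidate list, which is one plausible way to read a top-10 evaluation; the paper's exact definitions of precision and recall may differ.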
