Interpretable Clinical Genomics with a Likelihood Ratio Paradigm.

Human Phenotype Ontology (HPO)-based analysis has become standard for genomic diagnostics of rare diseases. Current algorithms use a variety of semantic and statistical approaches to prioritize the typically long lists of genes with candidate pathogenic variants. These algorithms do not provide robust estimates of the strength of their predictions beyond the placement in a ranked list, nor do they measure how much any individual phenotypic observation has contributed to the prioritization result. Yet because the overall success rate of genomic diagnostics is only around 25-50%, or less, in many cohorts, a good ranking cannot be taken to imply that the gene or disease at rank one is necessarily a good candidate. Likelihood ratios (LRs) are statistics for summarizing diagnostic accuracy, providing a measure of how much more (or less) likely a patient with a disease is to have a particular test result compared to patients without the disease. Here, we present an approach to genomic diagnostics that exploits the LR framework to provide an estimate of (1) the posttest probability of candidate diagnoses, (2) the LR for each observed HPO phenotype, and (3) the predicted pathogenicity of observed genotypes. LIkelihood Ratio Interpretation of Clinical AbnormaLities (LIRICAL) placed the correct diagnosis within the first three ranks in 92.9% of 384 case reports comprising 262 Mendelian diseases, with the correct diagnosis having a mean posttest probability of 67.3%. Simulations show that LIRICAL is robust to many typically encountered forms of genomic and phenomic noise. In summary, LIRICAL provides accurate, clinically interpretable results for phenotype-driven genomic diagnostics.
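
To make the LR framework concrete, the following minimal Python sketch shows the textbook relationship between a pretest probability, a set of likelihood ratios, and the resulting posttest probability. It is an illustration only and not the LIRICAL implementation: the function names are hypothetical, the numeric values are invented for the example, and multiplying the individual LRs assumes that the observations are conditionally independent, which is an assumption of this sketch rather than a claim taken from the abstract.

```python
from math import prod


def likelihood_ratio(freq_in_disease: float, freq_in_others: float) -> float:
    """LR of one observation: how much more (or less) likely it is in
    patients with the disease than in patients without it."""
    if freq_in_others <= 0.0:
        raise ValueError("frequency in patients without the disease must be > 0")
    return freq_in_disease / freq_in_others


def posttest_probability(pretest_prob: float, likelihood_ratios: list[float]) -> float:
    """Combine a pretest probability with a list of LRs.

    Assumption of this sketch: observations are conditionally independent,
    so the composite LR is the product of the individual LRs.
    """
    if not 0.0 < pretest_prob < 1.0:
        raise ValueError("pretest probability must lie strictly between 0 and 1")
    pretest_odds = pretest_prob / (1.0 - pretest_prob)
    posttest_odds = pretest_odds * prod(likelihood_ratios)
    return posttest_odds / (1.0 + posttest_odds)


if __name__ == "__main__":
    # Illustrative values only (not taken from the paper).
    lrs = [10.0, 5.0, 2.0]  # one LR per observed phenotype
    print(round(posttest_probability(0.01, lrs), 4))  # 0.5025
```

In this example a candidate diagnosis with a pretest probability of 1% and three supporting observations reaches a posttest probability of about 50%. Because each observation contributes its own factor, the per-observation LRs show which findings drove the result, which is the kind of interpretability the abstract describes.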
