White paper: The Helix Pathogenicity Prediction Platform

In this white paper we introduce Helix, an AI-based solution for missense pathogenicity prediction. Recent advances in human genome sequencing have made massive amounts of genetic data available, shifting the burden of labor in genetic diagnostics and research from gathering data to interpreting it. Helix is a state-of-the-art platform for predicting the pathogenicity of human missense variants. In addition to offering best-in-class predictive performance, Helix lets researchers analyze and interpret variants in depth. The platform can be accessed at helixlabs.ai.
