Using ALoFT to determine the impact of putative loss-of-function variants in protein-coding genes

Variants predicted to result in the loss of function of human genes have attracted interest because of their clinical impact and surprising prevalence in healthy individuals. Here, we present ALoFT (annotation of loss-of-function transcripts), a method to annotate and predict the disease-causing potential of loss-of-function variants. Using data from Mendelian disease-gene discovery projects, we show that ALoFT can distinguish between loss-of-function variants that are deleterious as heterozygotes and those causing disease only in the homozygous state. Investigation of variants discovered in healthy populations suggests that each individual carries at least two heterozygous premature stop alleles that could potentially lead to disease if present as homozygotes. When applied to de novo putative loss-of-function variants in autism-affected families, ALoFT distinguishes between deleterious variants in patients and benign variants in unaffected siblings. Finally, analysis of somatic variants in >6500 cancer exomes shows that putative loss-of-function variants predicted to be deleterious by ALoFT are enriched in known driver genes.

Variants causing loss of function (LoF) of human genes have clinical implications. Here, the authors present ALoFT (annotation of loss-of-function transcripts), a method to predict the disease-causing potential of LoF variants, and show its application to interpreting LoF variants in different contexts.
