A comprehensive framework for prioritizing variants in exome sequencing studies of Mendelian diseases

Exome sequencing is a promising strategy for finding novel mutations underlying human monogenic disorders. However, pinpointing the causal mutation in a small number of samples remains a major challenge. Here, we propose a three-level filtration and prioritization framework to identify the causal mutation(s) in exome sequencing studies. In proof-of-concept examples, this efficient and comprehensive framework narrowed whole-exome variants down to very small numbers of candidate variants. The proposed framework, implemented in a user-friendly software package named KGGSeq (http://statgenpro.psychiatry.hku.hk/kggseq), should prove useful in exome sequencing-based discovery of human Mendelian disease genes.
