The road from next-generation sequencing to personalized medicine.

Moving from a traditional medical model of treating pathologies to an individualized, predictive, and preventive model of personalized medicine promises to reduce healthcare costs for an overburdened system. Next-generation sequencing (NGS) has the potential to accelerate the early detection of disorders and the identification of pharmacogenetic markers for customizing treatments. This review describes the historical developments that led to NGS, along with its strengths and weaknesses, with special emphasis on the analytical methods used to process NGS data. Solutions exist for every step needed to perform NGS in a clinical context, and most of them are very efficient, but some crucial steps in the process need immediate attention.
