Personal genome sequencing: current approaches and challenges.

The revolution in DNA sequencing technologies has made it feasible to determine the genome sequences of many individuals, i.e., "personal genomes." Genome sequences of cells and tissues from both normal and disease states have been determined. With current approaches, whole human genome sequences are typically not assembled de novo; instead, variations relative to a reference sequence are identified. We discuss the current state of personal genome sequencing, the main steps involved in determining a genome sequence (i.e., identifying single-nucleotide polymorphisms [SNPs] and structural variations [SVs], assembling new sequences, and phasing haplotypes), and the challenges and performance metrics for evaluating the accuracy of the reconstruction. Finally, we consider the possible individual and societal benefits of personal genome sequences.
