The first Korean genome sequence and analysis: full genome sequencing for a socio-ethnic group.

We present the first Korean individual genome sequence (SJK) and its analysis. The diploid genome of a Korean male was sequenced to 28.95-fold redundancy using the Illumina paired-end sequencing method. SJK covered 99.9% of the NCBI human reference genome. We identified 420,083 novel single nucleotide polymorphisms (SNPs) that are not in the dbSNP database. Although SJK is closely similar to the Chinese genome (YH), the only other Asian genome available, the comparison also revealed significant differences: (1) 39.87% (1,371,239 out of 3,439,107) of SNPs were SJK-specific relative to YH (49.51% against Venter's, 46.94% against Watson's, and 44.17% against the Yoruba genomes); (2) 99.5% (22,495 out of 22,605) of short indels (< 4 bp) discovered at the same loci had the same size and type as in YH; and (3) 11.3% (331 out of 2,920) of deletion structural variants were SJK-specific. Even after attempting to map the unmapped SJK reads to unanchored NCBI scaffolds, HGSV, and available personal genomes, 5.77% of SJK reads still could not be mapped. Together, these findings indicate that the overall genetic differences among individuals from closely related ethnic groups may be significant. Hence, constructing reference genomes for minor socio-ethnic groups will be useful for large-scale individual genome sequencing.
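The proportions quoted for the SJK and YH comparison can be recomputed directly from the counts given in the abstract. The following is a minimal sketch of that arithmetic only; the function name `percent` and the dictionary keys are illustrative labels, not names from the study.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


# Counts copied from the abstract (SJK compared with YH).
# The keys are hypothetical labels chosen for this sketch.
reported = {
    "sjk_specific_snps": (1_371_239, 3_439_107),
    "concordant_short_indels": (22_495, 22_605),
    "sjk_specific_deletion_svs": (331, 2_920),
}

percentages = {
    name: percent(part, whole) for name, (part, whole) in reported.items()
}

for name, value in percentages.items():
    print(f"{name}: {value}%")
```

To two decimals the three ratios are 39.87%, 99.51%, and 11.34%, which agree with the 39.87%, 99.5%, and 11.3% stated in the abstract.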
