Variation of gene-based SNPs and linkage disequilibrium patterns in the human genome.

A principal goal in human genetics is to provide the tools necessary to enable genome-wide association studies. Extensive information on the distribution of gene-based single-nucleotide polymorphisms (SNPs) and linkage disequilibrium (LD) patterns across the genome is required in order to choose markers for efficient implementation of this approach. To obtain such information, we have genotyped a large Japanese cohort for SNPs identified by systematic resequencing of more than 14 000 autosomal genes. Analysis of these data led to the conclusion that the Japanese population contains approximately 130 000 common autosomal gene haplotypes (frequency >0.05), of which more than 35% are identified in the present study. We also examined allele frequencies and LD patterns according to the position of variants within genes, and their distribution across the genome. We found lower allele variability at exonic SNP sites (both non-synonymous and synonymous) compared with non-exonic SNP sites, and greater average LD between SNPs within exons of the same gene compared with other SNP combinations, both of which could be signals of selection. LD was correlated with the recombination rate per physical distance as estimated from the meiotic map, but the strength of the relationship varied considerably in different regions of the genome. Unique LD patterns, characterized by frequent instances of high LD between non-adjacent SNPs punctuated by blocks of low LD, were found in a 7 Mb region on chromosome 6p that includes the MHC (major histocompatibility complex) locus and many non-MHC genes. These results demonstrate the complexity that must be taken into account when considering SNP variability and LD patterns, while also providing tools necessary for implementation of efficient genome-wide association studies.
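The abstract refers to pairwise LD between SNPs without saying which statistic was used. As an illustration only, the sketch below computes two standard pairwise measures, Lewontin's D' and the squared correlation r², from haplotype and allele frequencies at two biallelic sites. The function name `ld_stats` and its parameters are hypothetical and are not taken from the study.

```python
def ld_stats(p_ab, p_a, p_b):
    """Pairwise LD for two biallelic SNPs (illustrative sketch, not the study's code).

    p_ab : frequency of the haplotype carrying allele A at SNP 1 and allele B at SNP 2
    p_a  : frequency of allele A at SNP 1
    p_b  : frequency of allele B at SNP 2
    Returns (D, D_prime, r2).
    """
    # Raw disequilibrium coefficient: observed minus expected haplotype frequency.
    d = p_ab - p_a * p_b

    # Largest |D| attainable given the allele frequencies; depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))

    # D' rescales |D| to the range [0, 1]; defined as 0 if a site is monomorphic.
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # r^2 is the squared correlation between the alleles at the two sites.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = (d * d) / denom if denom > 0 else 0.0

    return d, d_prime, r2


if __name__ == "__main__":
    # Made-up frequencies for demonstration, not data from the study.
    print(ld_stats(0.30, 0.30, 0.60))
```

D' reaches 1 whenever at least one of the four possible haplotypes is absent, whereas r² reaches 1 only when the two sites carry the same information, so the two measures can rank SNP pairs differently when choosing markers.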
