Bias of Selection on Human Copy-Number Variants

Although large-scale copy-number variation is an important contributor to conspecific genomic diversity, whether these variants frequently contribute to human phenotype differences remains unknown. If they have few functional consequences, then copy-number variants (CNVs) might be expected both to be distributed uniformly throughout the human genome and to encode genes that are characteristic of the genome as a whole.

We find that human CNVs are significantly overrepresented close to telomeres and centromeres and in simple tandem repeat sequences. Additionally, human CNVs are unusually enriched in protein-coding genes that have experienced significantly elevated synonymous and nonsynonymous nucleotide substitution rates, estimated between single human and mouse orthologues. Human CNV genes encode disproportionately large numbers of secreted, olfactory, and immunity proteins, although they include fewer genes associated with Mendelian disease than expected. Mouse CNVs also exhibit a significant elevation in synonymous substitution rates, but in most other respects they do not differ significantly from the genomic background. Nevertheless, they encode proteins that are depleted in olfactory function, and they exhibit significantly decreased amino acid sequence divergence.

Natural selection appears to have acted discriminately among human CNV genes. The significant overabundance, within human CNVs, of genes associated with olfaction, immunity, protein secretion, and elevated coding sequence divergence indicates that a subset may have been retained in the human population due to the adaptive benefit of increased gene dosage. By contrast, the functional characteristics of mouse CNVs suggest either that advantageous gene copies have been depleted during recent selective breeding of laboratory mouse strains or that they were preferentially fixed as a consequence of the larger effective population size of wild mice. It thus appears that CNV differences among mouse strains do not provide an appropriate model for large-scale sequence variations in the human population.
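The null expectation stated above, that CNV genes are characteristic of the genome as a whole, can be illustrated with a simple over-representation test. The sketch below is one common way to frame such a test (a one-sided hypergeometric tail probability) and is not necessarily the procedure used in the study. The function name and the gene counts in the usage lines are hypothetical and chosen only for illustration.

```python
from math import comb


def hypergeom_upper_tail(population, annotated, sampled, observed):
    """Return P(X >= observed) under a hypergeometric null.

    population: total number of genes in the genome-wide background
    annotated:  background genes carrying the annotation (e.g. olfactory)
    sampled:    number of genes overlapping CNVs
    observed:   CNV genes carrying the annotation
    """
    lowest = max(observed, 0)
    highest = min(annotated, sampled)
    total = comb(population, sampled)
    # Sum the probability mass for every outcome at least as extreme
    # as the observed count.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sampled - k)
        for k in range(lowest, highest + 1)
    )
    return tail / total


# Hypothetical toy counts, not taken from the study:
# 20 background genes, 5 annotated, 4 genes in CNVs, 2 of them annotated.
p_value = hypergeom_upper_tail(20, 5, 4, 2)
print(f"one-sided enrichment p-value: {p_value:.4f}")
```

A small p-value would indicate that the annotation is more frequent among CNV genes than random sampling from the background would explain. Testing for depletion (as for Mendelian disease genes) would use the lower tail instead.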
