Gene Copy-Number Polymorphism Caused by Retrotransposition in Humans

The era of whole-genome sequencing has revealed that gene copy-number changes caused by duplication and deletion events have important evolutionary, functional, and phenotypic consequences. Recent studies have therefore focused on revealing the extent of copy-number variation within natural populations of humans and other species. These studies have found a large number of copy-number variants (CNVs) in humans, many of which have been shown to have clinical or evolutionary importance. For the most part, however, these studies have failed to detect an important class of gene copy-number polymorphism: gene duplications caused by retrotransposition, which result in a new intron-less copy of the parental gene being inserted into a random location in the genome. Here we describe a computational approach leveraging next-generation sequence data to detect gene copy-number variants caused by retrotransposition (retroCNVs), and we report the first genome-wide analysis of these variants in humans. We find that retroCNVs account for a substantial fraction of gene copy-number differences between any two individuals. Moreover, we show that these variants may often result in expressed chimeric transcripts, underscoring their potential for the evolution of novel gene functions. By locating the insertion sites of these duplicates, we are able to show that retroCNVs have had an important role in recent human adaptation, and we also uncover evidence that positive selection may currently be driving multiple retroCNVs toward fixation. Together these findings imply that retroCNVs are an especially important class of polymorphism, and that future studies of copy-number variation should search for these variants in order to illuminate their potential evolutionary and functional relevance.
