Inter-population Differences in Retrogene Loss and Expression in Humans

Gene retroposition leads to considerable genetic variation between individuals. Recent studies revealed the presence of at least 208 retroduplication variations (RDVs), a class of polymorphisms in which a retrocopy is present in or absent from individual genomes. Most of these RDVs resulted from recent retroduplications. In this study, we used the results of Phase 1 of the 1000 Genomes Project to investigate variation in the loss of ancestral (i.e., shared with other primates) retrocopies among different human populations. In addition, we examined retrocopy expression levels using RNA-Seq data from the Illumina BodyMap project, as well as data from lymphoblastoid cell lines provided by the Geuvadis Consortium. We also developed a new approach to detect novel retrocopies absent from the human reference genome. We experimentally confirmed the existence of the detected retrocopies and determined their presence or absence in human genomes from 17 different populations. Altogether, we detected 193 RDVs, the majority of which resulted from retrocopy deletion. Most of these RDVs had not been previously reported. We experimentally confirmed the expression of 11 ancestral retrogenes that underwent deletion in certain individuals. The frequency of their deletion, with the exception of one retrogene, is very low. The expression, conservation, and low deletion rate of the remaining 10 retrocopies suggest that they may be functional. Aside from the presence or absence of expressed retrocopies, we also searched for differences in retrocopy expression levels between populations and found 9 retrogenes with statistically significant differential expression.
