Whole-genome screening indicates a possible burst of formation of processed pseudogenes and Alu repeats by particular L1 subfamilies in ancestral primates

Background: Abundant pseudogenes are a feature of mammalian genomes. Processed pseudogenes (PPs) are reverse transcribed from mRNAs. Recent molecular biological studies show that mammalian long interspersed element 1 (L1)-encoded proteins may have been involved in PP reverse transcription. Here, we present the first comprehensive analysis of human PPs using all known human genes as queries.

Results: The human genome was queried and 3,664 candidate PPs were identified. The most abundant were copies of genes encoding keratin 18, glyceraldehyde-3-phosphate dehydrogenase and ribosomal protein L21. A simple method was developed to estimate the level of nucleotide substitutions (and therefore the age) of PPs. A Poisson-like age distribution was obtained with a mean age close to that of the Alu repeats, the predominant human short interspersed elements. These data suggest a nearly simultaneous burst of PP and Alu formation in the genomes of ancestral primates. The peak period of amplification of these two distinct retrotransposons was estimated to be 40-50 million years ago. Concordant amplification of certain L1 subfamilies with PPs and Alus was observed.

Conclusions: We suggest that a burst of formation of PPs and Alus occurred in the genome of ancestral primates. One possible mechanism is that proteins encoded by members of particular L1 subfamilies acquired an enhanced ability to recognize cytosolic RNAs in trans.
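The abstract does not spell out how nucleotide substitutions were converted into ages, so the following is only a minimal sketch of one common approach, not the authors' method. It assumes a pairwise alignment between a PP and its parent gene, applies the Jukes-Cantor correction for multiple hits, and divides by a neutral substitution rate. The function names and the rate constant `ASSUMED_RATE_PER_SITE_PER_MYR` are hypothetical placeholders; the sketch also attributes all divergence to the pseudogene lineage, which is a simplification.

```python
import math

# Hypothetical placeholder rate (substitutions per site per million years).
# This value is an assumption for illustration, not a figure from the paper.
ASSUMED_RATE_PER_SITE_PER_MYR = 1.5e-3


def p_distance(seq_a: str, seq_b: str) -> float:
    """Fraction of mismatched sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return mismatches / compared


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected substitutions per site: K = -(3/4) ln(1 - 4p/3)."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def estimate_age_myr(k: float, rate: float = ASSUMED_RATE_PER_SITE_PER_MYR) -> float:
    """Convert substitutions per site into an age in million years.

    Simplification: all divergence is assumed to have accumulated
    neutrally on the pseudogene lineage.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    return k / rate


if __name__ == "__main__":
    gene = "ACGTACGTAC"        # toy parent-gene fragment (made up)
    pseudogene = "ACGTACGTAA"  # toy PP fragment with one mismatch (made up)
    p = p_distance(gene, pseudogene)
    k = jukes_cantor(p)
    print(f"p = {p:.3f}, K = {k:.4f}, age = {estimate_age_myr(k):.1f} Myr")
```

Applying such an estimate to every candidate PP and binning the resulting ages would give a distribution of the kind described in the Results; the absolute ages depend directly on the assumed rate.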
