Interspersed repeats in the horse (Equus caballus); spatial correlations highlight conserved chromosomal domains.

The interspersed repeat content of mammalian genomes has been best characterized in human, mouse and cow. In this study, we carried out de novo identification of repeated elements in the equine genome and identified previously unknown elements present at low copy number. The equine genome contains typical eutherian mammal repeats, but also has a significant number of hybrid repeats in addition to clade-specific Long Interspersed Nuclear Elements (LINE). Equus caballus clade-specific LINE-1 (L1) repeats can be classified into approximately five subfamilies, three of which have undergone significant expansion. There are 1115 full-length copies of these equine L1 elements, but of the 103 presumptively active copies, 93 fall within a single subfamily, indicating a rapid recent expansion of this subfamily. We also analysed both interspersed and simple sequence repeats (SSR) genome-wide, finding that some repeat classes are spatially correlated with each other as well as with G+C content and gene density. Based on these spatial correlations, we have confirmed that recently described ancestral vs. clade-specific genome territories can be defined by their repeat content. The clade-specific Short Interspersed Nuclear Element correlations were scattered over the genome and appear to have been extensively remodelled. In contrast, territories enriched for ancestral repeats tended to be contiguous domains. To determine whether the latter territories were evolutionarily conserved, we compared these results with a similar analysis of the human genome, and observed similar ancestral-repeat-enriched domains. These results indicate that ancestral, evolutionarily conserved mammalian genome territories can be identified on the basis of repeat content alone. Interspersed repeats of different ages appear to be analogous to geologic strata, allowing identification of ancient vs. newly remodelled regions of mammalian genomes.
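As an illustration of what a spatial correlation between repeat classes can mean in practice, the following minimal Python sketch bins a chromosome into fixed-size windows, computes the fraction of each window covered by each repeat class, and then takes the Pearson correlation between the two per-window profiles. It is not the procedure used in this study: the function names (bin_coverage, pearson), the bin size, the half-open interval convention and the toy coordinates are all assumptions chosen for the example, and the intervals within a class are assumed not to overlap.

```python
from math import sqrt


def bin_coverage(intervals, chrom_length, bin_size):
    """Fraction of each fixed-size bin covered by half-open (start, end) intervals.

    Assumes intervals within one repeat class do not overlap each other.
    """
    n_bins = (chrom_length + bin_size - 1) // bin_size
    covered = [0] * n_bins
    for start, end in intervals:
        # Clip the interval to the chromosome boundaries.
        pos = max(0, start)
        end = min(end, chrom_length)
        # Walk the interval bin by bin so spans crossing a boundary are split.
        while pos < end:
            b = pos // bin_size
            piece_end = min((b + 1) * bin_size, end)
            covered[b] += piece_end - pos
            pos = piece_end
    # The last bin may be shorter than bin_size.
    return [
        c / min(bin_size, chrom_length - i * bin_size)
        for i, c in enumerate(covered)
    ]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")  # correlation is undefined for a constant profile
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Toy coordinates (hypothetical, not taken from the equine assembly).
    class_a = [(0, 50), (120, 140), (300, 380)]
    class_b = [(10, 60), (100, 120), (310, 390)]
    profile_a = bin_coverage(class_a, chrom_length=400, bin_size=100)
    profile_b = bin_coverage(class_b, chrom_length=400, bin_size=100)
    print(profile_a, profile_b, pearson(profile_a, profile_b))
```

The same per-window profiles could be correlated against G+C content or gene density computed over identical windows; the choice of window size affects the result and would need to be justified for a real analysis.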
