Genome-wide characterization of centromeric satellites from multiple mammalian genomes.

Despite its importance in cell biology and evolution, the centromere has remained the final frontier in genome assembly and annotation because of its complex repeat structure. Isolating and characterizing the centromeric repeats of newly sequenced species is nevertheless necessary for a complete understanding of genome evolution and function. In recent years, various genomes have been sequenced, but characterization of the corresponding centromeric DNA has lagged behind. Here, we present a computational method (RepeatNet) that systematically identifies higher-order repeat structures from unassembled whole-genome shotgun sequence, and we test whether these sequence elements correspond to functional centromeric sequences. We analyzed genome datasets from six mammalian species representing the diversity of the mammalian lineage: horse, dog, elephant, armadillo, opossum, and platypus. We define candidate monomer satellite repeats and demonstrate centromeric localization for five of the six genomes. Our analysis revealed the greatest diversity of centromeric sequences in horse and dog, in contrast to elephant and armadillo, which showed high centromeric sequence homogeneity. We could not isolate centromeric sequences within the platypus genome, suggesting that centromeres in platypus are not enriched in satellite DNA. Our method can be applied to the thousands of other vertebrate genomes anticipated for sequencing in the near future, providing an important tool for the annotation of centromeres.
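The abstract does not describe how RepeatNet works internally, so the following is not its algorithm. It is only a minimal sketch of the general idea behind finding tandem (satellite-like) repeats in a single unassembled read: compare the read with shifted copies of itself and report the shift at which it matches best. The function name `best_period`, its parameters, and the 0.8 identity threshold are all assumptions made for illustration.

```python
def best_period(read, min_period=2, max_period=None, min_identity=0.8):
    """Return (period, identity) for the shift at which `read` best matches itself.

    Hypothetical helper for illustration only; not part of RepeatNet.
    Returns (None, best_identity) when no shift reaches `min_identity`.
    """
    n = len(read)
    if max_period is None:
        # Require at least two copies of the repeat unit within the read.
        max_period = n // 2

    best_p, best_id = None, 0.0
    for p in range(min_period, max_period + 1):
        # Fraction of positions that agree with the base p positions downstream.
        matches = sum(1 for i in range(n - p) if read[i] == read[i + p])
        identity = matches / (n - p)
        # Strict '>' keeps the smallest period when multiples tie.
        if identity > best_id:
            best_p, best_id = p, identity

    if best_id >= min_identity:
        return best_p, best_id
    return None, best_id


if __name__ == "__main__":
    tandem_read = "ACGTTGCA" * 5        # toy read: five copies of an 8-bp unit
    print(best_period(tandem_read))
    print(best_period("AAAACCCCGGGGTTTT"))  # no tandem structure at these shifts
```

A real pipeline would have to tolerate insertions and deletions, pool evidence across many reads, and then look for higher-order structure among the monomers; none of that is attempted here.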
