A Fast and Specific Alignment Method for Minisatellite Maps

Background: Variable minisatellites are among the most polymorphic markers of eukaryotic and prokaryotic genomes. This variability can affect gene coding regions, as in the prion protein gene, or gene regulatory regions, as in the cystatin B gene, and can be associated with or implicated in diseases: Creutzfeldt-Jakob disease and myoclonus epilepsy type 1, respectively. When it affects neutrally evolving regions, the length polymorphism of minisatellites (i.e., variation in the number of copies) has proved useful in population genetics.

Motivation: In these tandem repeat sequences, different mutational mechanisms cause both the number of copies and the copies themselves to vary. In particular, the interspersion of tandem duplication/contraction events with point mutations makes the succession of variant repeats much more informative than allele length alone. Exploiting this information requires the ability to align minisatellite alleles while accounting for both point mutations and tandem duplications.

Results: We propose a program for aligning minisatellite maps that improves on previous solutions. Our new program is faster, simpler, considers an extended evolutionary model, and is available to the community. We test it on a data set of 609 alleles of the human minisatellite MSY1 (DYF155S1) and confirm its ability to recover known evolutionary signals. Our experiments highlight that the informativeness of minisatellites resides in both their length and their composition polymorphisms. Exploiting both simultaneously is critical to unraveling the implications of variable minisatellites in the control of gene expression and in disease.
