Synchronized-TSP as a model for multilocus genetic consensus mapping

Numerous mapping projects conducted on different organisms have generated an abundance of mapping data. Consequently, many multilocus maps have been constructed for the same species using diverse mapping populations and marker sets. The quality of these maps varies broadly between populations, marker sets, and the software applied. Different versions of the map for the same organism may be inconsistent, which calls for integrating the mapping information and building consensus maps. The problem of multilocus consensus genetic mapping (MCGM) is even more challenging than multilocus mapping based on one data set, owing to several complications: differences in recombination rate and its distribution along chromosomes, and different subsets of markers used by different labs. We developed an approach to solving MCGM problems by searching for multilocus orders that contain the maximum number of shared markers and yield maps with minimum total length. The approach is based on re-analysis of raw data and is implemented as a two-phase algorithm. In Phase 1, multilocus ordering is performed for each data set, combined with iterative re-sampling to evaluate the stability of marker orders. In this phase, the ordering problem is reduced to the well-known Traveling Salesperson Problem (TSP). In Phase 2, consensus mapping is conducted by reducing the problem to a specific version of TSP that can be referred to as synchronized TSP. The optimal consensus order of shared markers is defined by the minimal total length of non-conflicting maps of the chromosome. This criterion admits various modifications that take into account variation in the quality of the original data (e.g., population size and marker quality). To solve MCGM problems, we use an adapted version of our Guided Evolution Strategy algorithm for discrete optimization of constrained problems. The developed approach was tested on a wide range of simulated data.
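The synchronized-TSP criterion of Phase 2 can be illustrated with a minimal sketch. This is not the Guided Evolution Strategy implementation: it enumerates all orders by brute force, which is feasible only for a handful of markers. It also assumes a simplified formulation in which one global marker order is shared by all data sets, each map being that order restricted to the markers the data set contains. The function names, the `weights` parameter (a stand-in for data-quality modifiers such as population size), and the toy distances are all hypothetical.

```python
from itertools import permutations


def dist_from_positions(positions):
    """Build a symmetric distance table from toy marker positions (illustrative data only)."""
    return {a: {b: abs(pa - pb) for b, pb in positions.items()}
            for a, pa in positions.items()}


def map_length(order, dist):
    """Length of one map: sum of distances between adjacent markers,
    using only the markers of `order` that this data set contains."""
    present = [m for m in order if m in dist]
    return sum(dist[a][b] for a, b in zip(present, present[1:]))


def synchronized_tsp(datasets, weights=None):
    """Brute-force search for a single marker order that minimizes the
    weighted sum of map lengths over all data sets (assumed criterion)."""
    markers = sorted(set().union(*(d.keys() for d in datasets)))
    weights = weights or [1.0] * len(datasets)
    best_order, best_cost = None, float("inf")
    for order in permutations(markers):
        if order[0] > order[-1]:
            continue  # an order and its mirror image give the same map
        cost = sum(w * map_length(order, d) for w, d in zip(weights, datasets))
        if cost < best_cost:
            best_order, best_cost = order, cost
    return best_order, best_cost


# Two toy data sets sharing markers B, C and D; distances differ between them,
# mimicking population-specific recombination rates.
pop1 = dist_from_positions({"A": 0, "B": 10, "C": 25, "D": 40})
pop2 = dist_from_positions({"B": 0, "C": 12, "D": 30, "E": 45})

consensus_order, total_length = synchronized_tsp([pop1, pop2])
print(consensus_order, total_length)  # ('A', 'B', 'C', 'D', 'E') 85.0
```

Because every data set is evaluated on the same order, the shared markers B, C and D cannot appear in conflicting orders in the two maps, which is the "synchronized" aspect. For realistic marker numbers the exhaustive loop would have to be replaced by a heuristic search.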
