Consensus Genetic Maps as Median Orders from Inconsistent Sources

A genetic map is an ordering of genetic markers calculated from a population of known lineage. While traditionally a map has been generated from a single population for each species, recently researchers have created maps from multiple populations. In the face of these new data, we address the need to find a consensus map: a map that combines the information from multiple partial and possibly inconsistent input maps. We model each input map as a partial order and formulate the consensus problem as finding a median partial order. Finding the median of multiple total orders (preferences or rankings) is a well-studied problem in social choice. We choose to find the median using the weighted symmetric difference distance, a more general version of both the symmetric difference distance and the Kemeny distance. Finding a median order using this distance is NP-hard. We show that for our chosen weight assignment, a median order satisfies the positive responsiveness, extended Condorcet, and unanimity criteria. Our solution involves finding the maximum acyclic subgraph of a weighted directed graph. We present a method that dynamically switches between an exact branch-and-bound algorithm and a heuristic algorithm, and show that for real data from closely related organisms, an exact median can often be found. We present experimental results using seven populations of the crop plant Zea mays.
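To make the connection between a median order and the maximum acyclic subgraph concrete, the following is a minimal sketch and not the authors' method. It simplifies in three ways that are assumptions of this example: each input map is a total order over a subset of markers rather than a general partial order, each edge weight is simply the number of input maps that place one marker before another (the paper's weight assignment is not given in the abstract), and the search is exhaustive rather than branch-and-bound. The function names and the marker labels `m1` to `m4` are hypothetical.

```python
from itertools import permutations


def pairwise_weights(maps):
    """Count, for each ordered pair (a, b), how many input maps place a before b.

    Assumption: this vote count stands in for the paper's weight assignment.
    """
    weights = {}
    for order in maps:
        for i, a in enumerate(order):
            for b in order[i + 1:]:
                weights[(a, b)] = weights.get((a, b), 0) + 1
    return weights


def brute_force_median(maps):
    """Return a total order that keeps the maximum total edge weight.

    The kept edges are those pointing forward in the order, so they form an
    acyclic subgraph of the weighted precedence graph. Exhaustive search over
    all permutations is only feasible for a handful of markers.
    """
    markers = sorted({m for order in maps for m in order})
    weights = pairwise_weights(maps)
    best_order, best_score = None, -1
    for perm in permutations(markers):
        pos = {m: i for i, m in enumerate(perm)}
        score = sum(w for (a, b), w in weights.items() if pos[a] < pos[b])
        if score > best_score:
            best_order, best_score = list(perm), score
    return best_order, best_score


# Three hypothetical input maps; the first two disagree on m2 versus m3,
# and the third covers only a subset of the markers.
maps = [
    ["m1", "m2", "m3", "m4"],
    ["m1", "m3", "m2", "m4"],
    ["m2", "m3", "m4"],
]
order, score = brute_force_median(maps)
print(order, score)
```

In this toy input the only conflict is between `m2` and `m3`; two maps place `m2` first and one places `m3` first, so the sketch discards the single edge `m3 -> m2` and keeps the rest. Exhaustive search scales factorially with the number of markers, which is why an exact method on real data needs pruning or a heuristic fallback.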
