The Potential of Family-Free Genome Comparison

Many methods in computational comparative genomics require gene family assignments as a prerequisite. While the biological concept of gene families is well established, their computational prediction remains unreliable. This paper continues a new line of research in which family assignments are not presumed. We study the potential of several family-free approaches for detecting conserved structures and genome rearrangements, and for reconstructing ancestral gene orders.
