Combinatorics and Bioinformatics: RNA Structure Comparison and Intergenomic Distance Computation (Combinatoire et Bio-informatique : comparaison de structures d'ARN et calcul de distances intergénomiques)

We present a set of results on two types of biological problems: (1) the comparison of RNA molecule structures and (2) the computation of intergenomic distances in the presence of duplicated genes. In this manuscript, we determine the algorithmic complexity of several problems related either to the comparison of RNA molecule structures (edit distance, the APS problem, 2-interval pattern matching, RNA design) or to genome rearrangements (breakpoint distance and conserved interval distance).

For all of these problems, our approach was to look, where possible, for fast exact algorithms. Whenever this did not seem possible, we tried to prove that the problem cannot be solved quickly, by showing that it is algorithmically hard. Where applicable, we then pursue the study of the problem by proposing essentially three types of results: (1) approximation, (2) parameterized complexity, and (3) heuristics. Throughout this manuscript, we use notions from combinatorial optimization, mathematics, graph theory, and algorithmics.
