Bellman’s GAP—a language and compiler for dynamic programming in sequence analysis

Motivation: Dynamic programming is ubiquitous in bioinformatics. Developing and implementing non-trivial dynamic programming algorithms is often error-prone and tedious. Bellman’s GAP is a new programming system designed to ease the development of bioinformatics tools based on the dynamic programming technique.

Results: In Bellman’s GAP, dynamic programming algorithms are described in a declarative style by tree grammars, evaluation algebras and products formed thereof. This bypasses the design of explicit dynamic programming recurrences and yields programs that are free of subscript errors, modular and easy to modify. The declarative modules are compiled into C++ code that is competitive with carefully hand-crafted implementations. This article introduces the Bellman’s GAP system and its language, GAP-L. It then demonstrates the ease of development and the degree of re-use by creating variants of two common bioinformatics algorithms. Finally, it evaluates Bellman’s GAP as an implementation platform for ‘real-world’ bioinformatics tools.

Availability: Bellman’s GAP is available under the GPL license from http://bibiserv.cebitec.uni-bielefeld.de/bellmansgap. This Web site includes a repository of re-usable modules for RNA folding based on thermodynamics.

Contact: robert@techfak.uni-bielefeld.de

Supplementary information: Supplementary data are available at Bioinformatics online.
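The separation of grammar, evaluation algebra and product described under Results can be illustrated with a small sketch. The code below is plain Python, not GAP-L, and does not reproduce the Bellman’s GAP compiler or its syntax. All names (`run`, `Count`, `BpMax`, `Pretty`, `Product`) are hypothetical, and the grammar is a simplified base-pairing grammar with no minimum hairpin loop size, assumed only for illustration. The search space is written once in `run`; each algebra supplies an interpretation of the grammar's three cases plus a choice function `h`, and `Product` combines two algebras without touching the recurrences.

```python
from functools import lru_cache

# Assumed pairing rule for this sketch: Watson-Crick and G-U wobble pairs.
PAIRS = {("a", "u"), ("u", "a"), ("c", "g"), ("g", "c"), ("g", "u"), ("u", "g")}

class Count:
    """Counts all candidate structures."""
    def nil(self): return 1
    def unpaired(self, base, s): return s
    def pair(self, left, s, right, t): return s * t
    def h(self, xs): return [sum(xs)] if xs else []

class BpMax:
    """Maximizes the number of base pairs."""
    def nil(self): return 0
    def unpaired(self, base, s): return s
    def pair(self, left, s, right, t): return s + t + 1
    def h(self, xs): return [max(xs)] if xs else []

class Pretty:
    """Renders candidates in dot-bracket notation; keeps every candidate."""
    def nil(self): return ""
    def unpaired(self, base, s): return "." + s
    def pair(self, left, s, right, t): return "(" + s + ")" + t
    def h(self, xs): return list(xs)

class Product:
    """Lexicographic product: choose by algebra a, then by algebra b."""
    def __init__(self, a, b):
        self.a, self.b = a, b
    def nil(self):
        return (self.a.nil(), self.b.nil())
    def unpaired(self, base, s):
        return (self.a.unpaired(base, s[0]), self.b.unpaired(base, s[1]))
    def pair(self, left, s, right, t):
        return (self.a.pair(left, s[0], right, t[0]),
                self.b.pair(left, s[1], right, t[1]))
    def h(self, xs):
        best = self.a.h([x for x, _ in xs])
        return [(k, v) for k in best
                for v in self.b.h([y for x, y in xs if x == k])]

def run(algebra, seq):
    """Evaluates the grammar  S -> nil | unpaired(b, S) | pair(b, S, b, S)
    over seq, with subproblems tabulated by lru_cache."""
    @lru_cache(maxsize=None)
    def struct(i, j):
        cands = []
        if i == j:
            cands.append(algebra.nil())
        else:
            for s in struct(i + 1, j):
                cands.append(algebra.unpaired(seq[i], s))
            for k in range(i + 1, j):
                if (seq[i], seq[k]) in PAIRS:
                    for s in struct(i + 1, k):
                        for t in struct(k + 1, j):
                            cands.append(algebra.pair(seq[i], s, seq[k], t))
        return tuple(algebra.h(cands))
    return list(struct(0, len(seq)))

print(run(Count(), "gac"))                     # number of structures
print(run(BpMax(), "gac"))                     # maximal number of base pairs
print(run(Product(BpMax(), Pretty()), "gac"))  # optimal score with its structure
```

Changing the analysis means swapping the algebra passed to `run`; the index arithmetic lives in one place and is never rewritten. In this sketch the work happens at run time through memoized recursion, whereas the abstract states that Bellman’s GAP compiles its declarative modules into C++ code.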
