SA-REPC - Sequence Alignment with Regular Expression Path Constraint

In this paper, we define a new variant of the constrained sequence alignment problem, Sequence Alignment with Regular Expression Path Constraint (SA-REPC), in which the constraint on the alignment path is given as a regular expression. Our definition generalizes existing definitions of alignment-path-constrained sequence alignment to the expressive power of regular expressions. We give a solution to the new problem and demonstrate its application by integrating microRNA-target interaction patterns into the target prediction computation. Our approach can serve as an efficient filter ahead of more computationally demanding target prediction algorithms. We compare cAlign, our implementation for the SA-REPC problem, with other microRNA target prediction algorithms.
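To make the problem statement concrete, the following is a minimal sketch of one way a regular-expression path constraint can be enforced during alignment. It is not the paper's algorithm and not cAlign. It rests on several assumptions made only for illustration: the alignment path is written as a string over the operations M (match), X (mismatch), D (gap in the second sequence), and I (gap in the first sequence); the regular expression has already been converted by hand into a deterministic automaton; scoring uses simple match, mismatch, and gap values; and a "match" means character equality rather than base-pair complementarity. The names `constrained_align` and `SEED_DFA` are hypothetical.

```python
from math import inf

def constrained_align(a, b, dfa, start, accepting, match=1, mismatch=-1, gap=-1):
    """Best global alignment score of a and b whose operation string is
    accepted by the automaton; returns -inf if no alignment satisfies it.

    dfa maps (state, op) -> next state; a missing key means the operation
    is forbidden in that state. Ops: 'M', 'X', 'D', 'I' (assumed alphabet).
    """
    states = {start} | set(accepting) | {s for s, _ in dfa} | set(dfa.values())
    n, m = len(a), len(b)
    # best[i][j][q]: best score aligning a[:i] with b[:j], automaton in state q
    best = [[{q: -inf for q in states} for _ in range(m + 1)]
            for _ in range(n + 1)]
    best[0][0][start] = 0
    for i in range(n + 1):
        for j in range(m + 1):
            for q, score in best[i][j].items():
                if score == -inf:
                    continue  # cell unreachable in this automaton state
                moves = []
                if i < n and j < m:
                    if a[i] == b[j]:
                        moves.append(("M", i + 1, j + 1, match))
                    else:
                        moves.append(("X", i + 1, j + 1, mismatch))
                if i < n:
                    moves.append(("D", i + 1, j, gap))
                if j < m:
                    moves.append(("I", i, j + 1, gap))
                for op, ni, nj, delta in moves:
                    nq = dfa.get((q, op))
                    if nq is not None and score + delta > best[ni][nj][nq]:
                        best[ni][nj][nq] = score + delta
    return max((best[n][m][q] for q in accepting), default=-inf)

# Hypothetical constraint MMM(M|X|D|I)*: the path must begin with three
# matches (a toy stand-in for a seed-like pattern), then anything goes.
SEED_DFA = {(0, "M"): 1, (1, "M"): 2, (2, "M"): 3}
SEED_DFA.update({(3, op): 3 for op in "MXDI"})

ok = constrained_align("ACGTA", "ACGAA", SEED_DFA, start=0, accepting={3})
blocked = constrained_align("TCGTA", "ACGTA", SEED_DFA, start=0, accepting={3})
```

In this sketch the dynamic-programming table gains one extra dimension for the automaton state, so the running time is the usual quadratic alignment cost multiplied by the number of states. The second call returns negative infinity because no alignment of those two strings can start with a match.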
