Gradient-based optimization of hyperparameters for base-pairing profile local alignment kernels.

We recently proposed novel kernel functions, called base-pairing profile local alignment (BPLA) kernels, for discrimination and detection of functional RNA sequences using support vector machines (SVMs). The kernels employ STRAL's scoring function, which takes into account sequence similarities as well as upstream and downstream base-pairing probabilities and thereby enables us to model the secondary structures of RNA sequences. In this paper, we develop a method for optimizing the hyperparameters of BPLA kernels with respect to discrimination accuracy using a gradient-based optimization technique. Our experiments show that the proposed method finds a nearly optimal set of parameters much faster than a grid search over all parameter combinations.
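The contrast between the two search strategies can be illustrated with a minimal sketch. It does not implement the BPLA kernel or the method of this paper: `surrogate_loss` is a hypothetical smooth quadratic standing in for a validation loss, the three "hyperparameters" and their target values are invented for illustration, and plain fixed-step gradient descent is swapped in for whatever optimizer the method actually uses. The sketch only shows why following a gradient can need far fewer objective evaluations than enumerating a grid.

```python
import itertools

# Hypothetical optimum of the toy objective (an assumption, not from the paper).
TARGET = (0.3, -1.2, 2.0)


def surrogate_loss(theta):
    """Toy smooth stand-in for a validation loss; NOT the BPLA objective."""
    return sum((t - c) ** 2 for t, c in zip(theta, TARGET))


def surrogate_grad(theta):
    """Analytic gradient of the toy loss with respect to each parameter."""
    return [2.0 * (t - c) for t, c in zip(theta, TARGET)]


def grid_search(axis_values, dim):
    """Evaluate the loss on every combination of axis_values in dim dimensions."""
    best_loss, best_theta, evals = None, None, 0
    for theta in itertools.product(axis_values, repeat=dim):
        evals += 1
        loss = surrogate_loss(theta)
        if best_loss is None or loss < best_loss:
            best_loss, best_theta = loss, theta
    return best_theta, best_loss, evals


def gradient_descent(theta0, step=0.25, tol=1e-8, max_iter=1000):
    """Fixed-step gradient descent; stops when the squared gradient norm < tol."""
    theta = list(theta0)
    evals = 0
    for _ in range(max_iter):
        grad = surrogate_grad(theta)
        evals += 1  # one gradient evaluation per iteration
        if sum(g * g for g in grad) < tol:
            break
        theta = [t - step * g for t, g in zip(theta, grad)]
    return theta, surrogate_loss(theta), evals


if __name__ == "__main__":
    axis = [-3.0 + 0.3 * k for k in range(21)]  # 21 grid points per parameter
    _, grid_loss, grid_evals = grid_search(axis, dim=3)
    _, gd_loss, gd_evals = gradient_descent([0.0, 0.0, 0.0])
    print("grid search:      loss=%.3g after %d evaluations" % (grid_loss, grid_evals))
    print("gradient descent: loss=%.3g after %d evaluations" % (gd_loss, gd_evals))
```

The grid cost grows exponentially with the number of parameters (21 points per axis gives 21^3 combinations here), whereas the gradient-based run depends mainly on how quickly the iterates converge. A real validation objective is generally non-convex and expensive to differentiate, so the gap seen on this toy problem should not be read as a measurement of the method described above.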
