Training Bidirectional Recurrent Neural Network Architectures with the Scaled Conjugate Gradient Algorithm

Predictions on sequential data, when both the upstream and downstream information is important, is a difficult and challenging task. The Bidirectional Recurrent Neural Network (BRNN) architecture has been designed to deal with this class of problems. In this paper, we present the development and implementation of the Scaled Conjugate Gradient (SCG) learning algorithm for BRNN architectures. The model has been tested on the Protein Secondary Structure Prediction (PSSP) and Transmembrane Protein Topology Prediction problems (TMPTP). Our method currently achieves preliminary results close to 73 % correct predictions for the PSSP problem and close to 79 % for the TMPTP problem, which are expected to increase with larger datasets, external rules, ensemble methods and filtering techniques. Importantly, the SCG algorithm is training the BRNN architecture approximately 3 times faster than the Backpropagation Through Time (BPTT) algorithm.

[1]  Kuldip K. Paliwal,et al.  Bidirectional recurrent neural networks , 1997, IEEE Trans. Signal Process..

[2]  Paul J. Werbos,et al.  Backpropagation Through Time: What It Does and How to Do It , 1990, Proc. IEEE.

[3]  Giovanni Soda,et al.  Exploiting the past and the future in protein secondary structure prediction , 1999, Bioinform..

[4]  Martin Fodslette Møller,et al.  A scaled conjugate gradient algorithm for fast supervised learning , 1993, Neural Networks.

[5]  Jeffrey L. Elman,et al.  Finding Structure in Time , 1990, Cogn. Sci..

[6]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[7]  Vasilis J. Promponas,et al.  Protein Secondary Structure Prediction with Bidirectional Recurrent Neural Nets: Can Weight Updating for Each Residue Enhance Performance? , 2010, AIAI.

[8]  David T. Jones,et al.  Transmembrane protein topology prediction using support vector machines , 2009, BMC Bioinformatics.

[9]  G J Barton,et al.  Evaluation and improvement of multiple sequence methods for protein secondary structure prediction , 1999, Proteins.

[10]  Alessandro Sperduti,et al.  A general framework for adaptive processing of data structures , 1998, IEEE Trans. Neural Networks.

[11]  Thomas L. Madden,et al.  Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. , 1997, Nucleic acids research.

[12]  Vasilis J. Promponas,et al.  A Comparative Study on Filtering Protein Secondary Structure Prediction , 2012, IEEE/ACM Transactions on Computational Biology and Bioinformatics.

[13]  Thomas G. Dietterich Machine Learning for Sequential Data: A Review , 2002, SSPR/SPR.

[14]  F. Richards,et al.  Identification of structural motifs from protein coordinate data: Secondary structure and first‐level supersecondary structure * , 1988, Proteins.