Long-Short Term Memory Network for RNA Structure Profiling Super-Resolution

Profiling of RNAs improves understanding of cellular mechanisms, which can be essential to curing various diseases. Fully characterizing the three-dimensional structures of the roughly 200,000 RNAs in humans with the mutate-and-map strategy is estimated to take years. To speed up the profiling process, we propose a solution based on super-resolution. We applied five machine learning regression methods to perform RNA structure profiling super-resolution, i.e., to recover complete data sets by exploiting self-similarity in low-resolution (undersampled) data sets. In particular, our novel Interaction Encoded Long-Short Term Memory (IELSTM) network can handle multiple distant interactions in RNA sequences. Compared with ridge regression, LASSO regression, multilayer perceptron regression, and random forest regression, the IELSTM network reduces the mean squared error and the median absolute error by at least 33% and 31%, respectively, on three RNA structure profiling data sets.
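The two error metrics named above, and the relative reduction between a baseline and a candidate model, can be sketched as follows. This is only an illustration of the arithmetic: the function names and the toy numbers are assumptions made for the example and are not taken from the paper's data or code.

```python
from statistics import median


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def median_absolute_error(y_true, y_pred):
    """Median of the absolute differences; less sensitive to outliers than MSE."""
    return median(abs(t - p) for t, p in zip(y_true, y_pred))


def relative_reduction(baseline_error, model_error):
    """Fraction by which model_error is lower than baseline_error."""
    return (baseline_error - model_error) / baseline_error


# Hypothetical toy values standing in for held-out reactivity measurements.
y_true = [0.0, 1.0, 2.0, 3.0]
baseline_pred = [1.0, 1.0, 1.0, 5.0]  # e.g. a linear baseline (assumed)
model_pred = [0.5, 1.0, 2.5, 3.0]     # e.g. a sequence model (assumed)

mse_gain = relative_reduction(
    mean_squared_error(y_true, baseline_pred),
    mean_squared_error(y_true, model_pred),
)
medae_gain = relative_reduction(
    median_absolute_error(y_true, baseline_pred),
    median_absolute_error(y_true, model_pred),
)
print(f"MSE reduction: {mse_gain:.1%}, median absolute error reduction: {medae_gain:.1%}")
```

A claim of the form "at least 33%" would then correspond to taking the smallest such reduction over all baselines and data sets.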
