Learning to Forget: Continual Prediction with LSTM
Felix A. Gers | Jürgen Schmidhuber | Fred Cummins
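This paper introduced the forget gate, which lets an LSTM memory cell learn to reset its internal state during continual prediction over unsegmented input streams. As a rough, non-authoritative sketch of that mechanism (using NumPy and generic weight names W, U, b chosen here for illustration, not the paper's exact notation or gate wiring):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One forward step of an LSTM cell with input, forget, and output gates.

    W, U, b are dicts keyed by 'i', 'f', 'o', 'g' holding the input weights,
    recurrent weights, and biases for each gate / candidate update
    (illustrative parameter layout, not from the paper).
    """
    i = sigmoid(W['i'] @ x + U['i'] @ h_prev + b['i'])   # input gate
    f = sigmoid(W['f'] @ x + U['f'] @ h_prev + b['f'])   # forget gate
    o = sigmoid(W['o'] @ x + U['o'] @ h_prev + b['o'])   # output gate
    g = np.tanh(W['g'] @ x + U['g'] @ h_prev + b['g'])   # candidate cell update
    c = f * c_prev + i * g                               # forget gate scales the carried-over cell state
    h = o * np.tanh(c)                                   # exposed hidden state
    return h, c

The key line is c = f * c_prev + i * g: when the forget gate output approaches zero, the accumulated cell state is released rather than growing without bound, which is what makes continual (unsegmented) prediction feasible.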
[1] Paul J. Werbos, et al. Generalization of backpropagation with application to a recurrent gas market model, 1988, Neural Networks.
[2] Geoffrey E. Hinton, et al. Learning distributed representations of concepts, 1989.
[3] Jürgen Schmidhuber, et al. A local learning algorithm for dynamic feedforward and recurrent networks, 1990, Forschungsberichte, TU Munich.
[4] Michael C. Mozer, et al. A Focused Backpropagation Algorithm for Temporal Pattern Recognition, 1989, Complex Syst.
[5] David Zipser, et al. Learning Sequential Structure with the Real-Time Recurrent Learning Algorithm, 1991, Int. J. Neural Syst.
[6] James L. McClelland, et al. Finite State Automata and Simple Recurrent Networks, 1989, Neural Computation.
[7] Alexander H. Waibel, et al. Modular Construction of Time-Delay Neural Networks for Speech Recognition, 1989, Neural Computation.
[8] Kenji Doya, et al. Adaptive neural oscillator using continuous-time back-propagation learning, 1989, Neural Networks.
[9] Jing Peng, et al. An Efficient Gradient-Based Algorithm for On-Line Training of Recurrent Network Trajectories, 1990, Neural Computation.
[10] Michael C. Mozer, et al. Connectionist Music Composition Based on Melodic and Stylistic Constraints, 1990, NIPS.
[11] Jeffrey L. Elman, et al. Finding Structure in Time, 1990, Cogn. Sci..
[12] Scott E. Fahlman, et al. The Recurrent Cascade-Correlation Architecture, 1990, NIPS.
[13] Michael I. Jordan. Attractor dynamics and parallelism in a connectionist sequential machine, 1990.
[14] David E. Rumelhart, et al. Generalization by Weight-Elimination with Application to Forecasting, 1990, NIPS.
[15] Alexander H. Waibel, et al. Multi-State Time Delay Networks for Continuous Speech Recognition, 1991, NIPS.
[16] Sepp Hochreiter. Untersuchungen zu dynamischen neuronalen Netzen, 1991.
[17] Jürgen Schmidhuber, et al. A Fixed Size Storage O(n³) Time Complexity Learning Algorithm for Fully Recurrent Continually Running Networks, 1992, Neural Computation.
[18] Ah Chung Tsoi, et al. Locally recurrent globally feedforward networks: a critical review of architectures, 1994, IEEE Trans. Neural Networks.
[19] Andreas S. Weigend, et al. Time Series Prediction: Forecasting the Future and Understanding the Past, 1994.
[20] Yoshua Bengio, et al. Learning long-term dependencies with gradient descent is difficult, 1994, IEEE Trans. Neural Networks.
[21] Eric Mjolsness, et al. A Multiscale Attentional Framework for Relaxation Neural Networks, 1995, NIPS.
[22] Ronald J. Williams, et al. Gradient-based learning algorithms for recurrent networks and their computational complexity, 1995.
[23] Barak A. Pearlmutter. Gradient calculations for dynamic recurrent neural networks: a survey, 1995, IEEE Trans. Neural Networks.
[24] Peter Tiño, et al. Learning long-term dependencies in NARX recurrent neural networks, 1996, IEEE Trans. Neural Networks.
[25] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[26] Christian J. Darken, et al. Stochastic approximation and neural network learning, 1998.
[27] Fred Cummins, et al. Automatic discrimination among languages based on prosody alone, 1999.
[28] Jürgen Schmidhuber, et al. Language identification from prosody without explicit features, 1999, EUROSPEECH.
[29] Jonathan D. Cohen, et al. A Biologically Based Computational Model of Working Memory, 1999.
[30] Jürgen Schmidhuber, et al. Learning to forget: continual prediction with LSTM, 1999.
[31] Fred Cummins, et al. Learning to Forget: Continual Prediction with LSTM, 1999.
[32] Gavin C. Cawley, et al. On a Fast, Compact Approximation of the Exponential Function, 2000, Neural Computation.
[33] Jürgen Schmidhuber, et al. LSTM recurrent networks learn simple context-free and context-sensitive languages, 2001, IEEE Trans. Neural Networks.