Optimal learning in artificial neural networks: A theoretical view

Paolo Frasconi | Marco Gori | Marco Maggini | Monica Bianchini

[1] Aimo A. Törn, et al. Global Optimization, 1999, Science.

[2] Peter Tiño, et al. Learning long-term dependencies in NARX recurrent neural networks, 1996, IEEE Trans. Neural Networks.

[3] X. H. Yu, et al. On the local minima free condition of backpropagation learning, 1995, IEEE Trans. Neural Networks.

[4] Paolo Frasconi, et al. Learning without local minima in radial basis function networks, 1995, IEEE Trans. Neural Networks.

[5] Giovanni Soda, et al. Unified Integration of Explicit Knowledge and Learning by Example in Recurrent Networks, 1995, IEEE Trans. Knowl. Data Eng.

[6] Paolo Frasconi, et al. Learning in multilayered networks used as autoassociators, 1995, IEEE Trans. Neural Networks.

[7] Marco Gori, et al. On the problem of local minima in recurrent neural networks, 1994, IEEE Trans. Neural Networks.

[8] Yoshua Bengio, et al. Learning long-term dependencies with gradient descent is difficult, 1994, IEEE Trans. Neural Networks.

[9] Robert Hecht-Nielsen, et al. On the Geometry of Feedforward Neural Network Error Surfaces, 1993, Neural Computation.

[10] Russell Reed, et al. Pruning algorithms-a survey, 1993, IEEE Trans. Neural Networks.

[11] Robert A. Jacobs, et al. Hierarchical Mixtures of Experts and the EM Algorithm, 1993, Neural Computation.

[12] Paolo Frasconi, et al. Backpropagation for linearly-separable patterns: A detailed analysis, 1993, IEEE International Conference on Neural Networks.

[13] Tsu-Shuan Chang, et al. A universal neural net with guaranteed convergence to zero system error, 1992, IEEE Trans. Signal Process.

[14] Babak Hassibi, et al. Second Order Derivatives for Network Pruning: Optimal Brain Surgeon, 1992, NIPS.

[15] Xiao-Hu Yu, et al. Can backpropagation error surface not have local minima, 1992, IEEE Trans. Neural Networks.

[16] Geoffrey E. Hinton, et al. Simplifying Neural Networks by Soft Weight-Sharing, 1992, Neural Computation.

[17] Sandro Ridella, et al. Statistically controlled activation weight initialization (SCAWI), 1992, IEEE Trans. Neural Networks.

[18] Yoshua Bengio, et al. Learning the dynamic nature of speech with back-propagation for sequences, 1992, Pattern Recognit. Lett.

[19] Jürgen Schmidhuber, et al. Learning Complex, Extended Sequences Using the Principle of History Compression, 1992, Neural Computation.

[20] Giovanni Soda, et al. Local Feedback Multilayered Networks, 1992, Neural Computation.

[21] Myung Won Kim, et al. The effect of initial weights on premature saturation in back-propagation learning, 1991, IJCNN-91-Seattle International Joint Conference on Neural Networks.

[22] D. Rumelhart, et al. Generalization by weight-elimination applied to currency exchange rate prediction, 1991, [Proceedings] 1991 IEEE International Joint Conference on Neural Networks.

[23] Geoffrey E. Hinton, et al. Adaptive Mixtures of Local Experts, 1991, Neural Computation.

[24] Jing Peng, et al. An Efficient Gradient-Based Algorithm for On-Line Training of Recurrent Network Trajectories, 1990, Neural Computation.

[25] David E. Rumelhart, et al. Generalization by Weight-Elimination with Application to Forecasting, 1990, NIPS.

[26] T. Kohonen. The self-organizing map, 1990, Neurocomputing.

[27] Bernard Widrow, et al. 30 years of adaptive neural networks: perceptron, Madaline, and backpropagation, 1990, Proc. IEEE.

[28] F. Girosi, et al. Networks for approximation and learning, 1990, Proc. IEEE.

[29] John J. Shynk, et al. Performance surfaces of a single-layer perceptron, 1990, IEEE Trans. Neural Networks.

[30] Ehud D. Karnin, et al. A simple procedure for pruning back-propagation trained neural networks, 1990, IEEE Trans. Neural Networks.

[31] Stephen I. Gallant, et al. Perceptron-based learning algorithms, 1990, IEEE Trans. Neural Networks.

[32] Marcus Frean, et al. The Upstart Algorithm: A Method for Constructing and Training Feedforward Neural Networks, 1990, Neural Computation.

[33] Chuanyi Ji, et al. Generalizing Smoothness Constraints from Discrete Samples, 1990, Neural Computation.

[34] Michael I. Jordan, et al. Task Decomposition Through Competition in a Modular Connectionist Architecture: The What and Where Vision Tasks, 1990, Cogn. Sci.

[35] Piero Cosi, et al. Phonetically-based multi-layered neural networks for vowel classification, 1990, Speech Commun.

[36] E. K. Blum, et al. Approximation of Boolean Functions by Sigmoidal Networks: Part I: XOR and Other Two-Variable Functions, 1989, Neural Computation.

[37] Eduardo D. Sontag, et al. Backpropagation separates when perceptrons do, 1989, International 1989 Joint Conference on Neural Networks.

[38] Geoffrey E. Hinton. Connectionist Learning Procedures, 1989, Artif. Intell.

[39] Kurt Hornik, et al. Multilayer feedforward networks are universal approximators, 1989, Neural Networks.

[40] J. Nadal, et al. Learning in feedforward layered networks: the tiling algorithm, 1989.

[41] John Moody, et al. Fast Learning in Networks of Locally-Tuned Processing Units, 1989, Neural Computation.

[42] Ken-ichi Funahashi, et al. On the approximate realization of continuous mappings by neural networks, 1989, Neural Networks.

[43] J. Slawny, et al. Back propagation fails to separate where perceptrons succeed, 1989.

[44] Alexander H. Waibel, et al. Modular Construction of Time-Delay Neural Networks for Speech Recognition, 1989, Neural Computation.

[45] Geoffrey E. Hinton, et al. Phoneme recognition using time-delay neural networks, 1989, IEEE Trans. Acoust. Speech Signal Process.

[46] Esther Levin, et al. Accelerated Learning in Layered Neural Networks, 1988, Complex Syst.

[47] D. R. Hush, et al. Improving the learning rate of back-propagation with the gradient reuse algorithm, 1988, IEEE 1988 International Conference on Neural Networks.

[48] D. Zipser, et al. Learning the hidden structure of speech, 1988, The Journal of the Acoustical Society of America.

[49] J. Fodor, et al. Connectionism and cognitive architecture: A critical analysis, 1988, Cognition.

[50] Geoffrey E. Hinton, et al. Learning sets of filters using back-propagation, 1987.

[51] Geoffrey E. Hinton, et al. Learning internal representations by error propagation, 1986.

[52] J. J. Hopfield, et al. Neural networks and physical systems with emergent collective computational abilities, 1982, Proceedings of the National Academy of Sciences of the United States of America.

[53] D. Rubin, et al. Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper, 1977.

[54] Marco Gori, et al. Optimal convergence of on-line backpropagation, 1996, IEEE Trans. Neural Networks.

[55] Alberto Tesi, et al. On the Problem of Local Minima in Backpropagation, 1992, IEEE Trans. Pattern Anal. Mach. Intell.

[56] George Cybenko, et al. Approximation by superpositions of a sigmoidal function, 1992, Math. Control. Signals Syst.

[57] David E. Rumelhart, et al. Back-Propagation, Weight-Elimination and Time Series Prediction, 1991.

[58] Yih-Fang Huang, et al. Bounds on the number of hidden neurons in multilayer perceptrons, 1991, IEEE Trans. Neural Networks.

[59] Pietro Burrascano, et al. A norm selection criterion for the generalized delta rule, 1991, IEEE Trans. Neural Networks.

[60] Hervé Bourlard, et al. Speech pattern discrimination and multilayer perceptrons, 1989.

[61] Kurt Hornik, et al. Neural networks and principal component analysis: Learning from examples without local minima, 1989, Neural Networks.

[62] Eduardo D. Sontag, et al. Backpropagation Can Give Rise to Spurious Local Minima Even for Networks without Hidden Layers, 1989, Complex Syst.

[63] Yann LeCun, et al. Optimal Brain Damage, 1989, NIPS.

[64] R. Hecht-Nielsen, et al. Back propagation error surfaces can have local minima, 1989, International 1989 Joint Conference on Neural Networks.

[65] Yann LeCun, et al. Generalization and network design strategies, 1989.

[66] Christian Lebiere, et al. The Cascade-Correlation Learning Architecture, 1989, NIPS.

[67] Robert Hecht-Nielsen, et al. Theory of the backpropagation neural network, 1989, International 1989 Joint Conference on Neural Networks.

[68] Bernard Widrow, et al. Adaptive switching circuits, 1988.

[69] Yves Chauvin, et al. A Back-Propagation Algorithm with Optimal Use of Hidden Units, 1988, NIPS.

[70] Michael C. Mozer, et al. Skeletonization: A Technique for Trimming the Fat from a Network via Relevance Assessment, 1988, NIPS.

[71] Eric B. Baum, et al. Supervised Learning of Probability Distributions by Neural Networks, 1987, NIPS.

[72] Y. Le Cun. Learning Process in an Asymmetric Threshold Network, 1986.

[73] Geoffrey E. Hinton, et al. A Learning Algorithm for Boltzmann Machines, 1985, Cogn. Sci.

[74] P. Werbos, et al. Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences, 1974.