The general inefficiency of batch training for gradient descent learning
[1] Philipp Slusallek, et al. Introduction to real-time ray tracing, 2005, SIGGRAPH Courses.
[2] Teuvo Kohonen, et al. Self-organized formation of topologically correct feature maps, 2004, Biological Cybernetics.
[3] Tony R. Martinez, et al. The need for small learning rates on large problems, 2001, IJCNN'01. International Joint Conference on Neural Networks. Proceedings (Cat. No.01CH37222).
[4] Ali Zilouchian, et al. Fundamentals of Neural Networks, 2001.
[5] Amir F. Atiya, et al. New results on recurrent network training: unifying the algorithms and accelerating convergence, 2000, IEEE Trans. Neural Networks Learn. Syst.
[6] Jose C. Principe, et al. Neural and adaptive systems, 2000.
[7] J. Nazuno. Haykin, Simon. Neural networks: A comprehensive foundation, Prentice Hall, Inc., Second Edition, 1999, 2000.
[8] J. Spall. Stochastic Optimization, Stochastic Approximation and Simulated Annealing, 1999.
[9] Enrico Gobbetti, et al. Encyclopedia of Electrical and Electronics Engineering, 1999.
[10] Catherine Blake, et al. UCI Repository of machine learning databases, 1998.
[11] M.H. Hassoun, et al. Fundamentals of Artificial Neural Networks, 1996, Proceedings of the IEEE.
[12] Yoshua Bengio, et al. Neural networks for speech and sequence recognition, 1996.
[13] Laurene V. Fausett, et al. Fundamentals of Neural Networks, 1994.
[14] S. Haykin, et al. Neural Networks: A Comprehensive Foundation, 1994.
[15] Martin Fodslette Møller, et al. A scaled conjugate gradient algorithm for fast supervised learning, 1993, Neural Networks.
[16] Heekuck Oh, et al. Neural Networks for Pattern Recognition, 1993, Adv. Comput.
[17] Philip D. Wasserman, et al. Advanced methods in neural computing, 1993, VNR computer library.
[18] Biing-Hwang Juang, et al. Fundamentals of speech recognition, 1993, Prentice Hall signal processing series.
[19] Yoshua Bengio, et al. Artificial neural networks and their application to sequence recognition, 1991.
[20] Paul Glasserman, et al. Gradient Estimation Via Perturbation Analysis, 1990.
[21] Sholom M. Weiss, et al. Computer Systems That Learn, 1990.
[22] Françoise Fogelman-Soulié, et al. Speaker-independent isolated digit recognition: Multilayer perceptrons vs. Dynamic time warping, 1990, Neural Networks.
[23] Geoffrey E. Hinton, et al. Proceedings of the 1988 Connectionist Models Summer School, 1989.
[24] Yann LeCun, et al. Improving the convergence of back-propagation learning with second-order methods, 1989.
[25] James L. McClelland, et al. Parallel distributed processing: explorations in the microstructure of cognition, vol. 1: foundations, 1986.
[26] T. Kohonen. Self-organized formation of topographically correct feature maps, 1982.
[27] Frank Rosenblatt, et al. Principles of Neurodynamics. Perceptrons and the Theory of Brain Mechanisms, 1963.
[28] J. Kiefer, et al. Stochastic Estimation of the Maximum of a Regression Function, 1952.