Paper information - Stochastic Learning

Stochastic Learning

This contribution presents an overview of the theoretical and practical aspects of the broad family of learning algorithms based on Stochastic Gradient Descent, including Perceptrons, Adalines, K-Means, LVQ, Multi-Layer Networks, and Graph Transformer Networks.

Léon Bottou

[1] Shun-ichi Amari,et al. A Theory of Adaptive Pattern Classifiers , 1967, IEEE Trans. Electron. Comput..

[2] J. MacQueen. Some methods for classification and analysis of multivariate observations , 1967 .

[3] Richard O. Duda,et al. Pattern classification and scene analysis , 1974, A Wiley-Interscience publication.

[4] Ya. Z. Tsypkin,et al. Foundations of the theory of learning systems , 1973 .

[5] Kumpati S. Narendra,et al. Adaptation and learning in automatic systems , 1974 .

[6] Vladimir Vapnik,et al. Estimation of Dependences Based on Empirical Data , 1982, Springer Series in Statistics.

[7] Lennart Ljung,et al. Theory and Practice of Recursive Identification , 1983 .

[8] John E. Dennis,et al. Numerical methods for unconstrained optimization and nonlinear equations , 1983, Prentice Hall series in computational mathematics.

[9] Shun-ichi Amari,et al. Differential-geometrical methods in statistics , 1985 .

[10] Geoffrey E. Hinton,et al. Learning internal representations by error propagation , 1986 .

[11] S. Thomas Alexander,et al. Adaptive Signal Processing , 1986, Texts and Monographs in Computer Science.

[12] Terrence J. Sejnowski,et al. Parallel Networks that Learn to Pronounce English Text , 1987, Complex Syst..

[13] Bernard Widrow,et al. Adaptive switching circuits , 1988 .

[14] Teuvo Kohonen,et al. Statistical pattern recognition with neural networks , 1988, Neural Networks.

[15] Yann LeCun,et al. Improving the convergence of back-propagation learning with second-order methods , 1989 .

[16] Lawrence D. Jackel,et al. Backpropagation Applied to Handwritten Zip Code Recognition , 1989, Neural Computation.

[17] Pierre Priouret,et al. Adaptive Algorithms and Stochastic Approximations , 1990, Applications of Mathematics.

[18] E. Capaldi,et al. The organization of behavior. , 1992, Journal of applied behavior analysis.

[19] Roberto Battiti,et al. First- and Second-Order Methods for Learning: Between Steepest Descent and Newton's Method , 1992, Neural Computation.

[20] Isabelle Guyon,et al. Recognition-Based Segmentation of On-Line Hand-Printed Words , 1992, NIPS.

[21] John C. Platt,et al. Postal Address Block Location Using a Convolutional Locator Network , 1993, NIPS.

[22] G. Orr,et al. Momentum and optimal stochastic search , 1993 .

[23] Yoshua Bengio,et al. Convergence Properties of the K-Means Algorithms , 1994, NIPS.

[24] Anton Gunzinger,et al. Fast neural net simulation with a DSP processor array , 1995, IEEE Trans. Neural Networks.

[25] Yoshua Bengio,et al. LeRec: A NN/HMM Hybrid for On-Line Handwriting Recognition , 1995, Neural Computation.

[26] Shun-ichi Amari,et al. Neural Learning in Structured Parameter Spaces - Natural Riemannian Gradient , 1996, NIPS.

[27] Yoshua Bengio,et al. Global training of document processing systems using graph transformer networks , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[28] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.

[29] Claudio Gentile,et al. Linear Hinge Loss and Average Margin , 1998, NIPS.

[30] Shun-ichi Amari,et al. Statistical analysis of learning dynamics , 1999, Signal Process..

[31] Vladimir N. Vapnik,et al. The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.

[32] Nicol N. Schraudolph,et al. Conjugate Directions for Stochastic Gradient Descent , 2002, ICANN.

[33] Yann LeCun,et al. Large Scale Online Learning , 2003, NIPS.

[34] Ji Zhu,et al. Margin Maximizing Loss Functions , 2003, NIPS.

[35] Teuvo Kohonen,et al. Self-organized formation of topologically correct feature maps , 2004, Biological Cybernetics.

[36] Y. LeCun,et al. Learning methods for generic object recognition with invariance to pose and lighting , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2004).

[37] Léon Bottou,et al. On-line learning for very large data sets , 2005 .