Periodic step-size adaptation in second-order gradient descent for single-pass on-line structured learning
Yuh-Jye Lee | Chun-Nan Hsu | Yu-Ming Chang | Han-Shen Huang
[1] J. Miller. Numerical Analysis , 1966, Nature.
[2] James M. Ortega,et al. Iterative solution of nonlinear equations in several variables , 2014, Computer science and applied mathematics.
[3] D. Rubin,et al. Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper , 1977 .
[4] J. Douglas Faires,et al. Numerical Analysis , 1981 .
[5] J. Traub. Iterative Methods for the Solution of Equations , 1982 .
[6] Scott E. Fahlman,et al. An empirical study of learning speed in back-propagation networks , 1988 .
[7] Yann LeCun,et al. Improving the convergence of back-propagation learning with second-order methods , 1989 .
[8] Geoffrey E. Hinton,et al. Proceedings of the 1988 Connectionist Models Summer School , 1989 .
[9] Pierre Priouret,et al. Adaptive Algorithms and Stochastic Approximations , 1990, Applications of Mathematics.
[10] Xiao-Li Meng,et al. On the global and componentwise rates of convergence of the EM algorithm , 1994 .
[11] G. McLachlan,et al. The EM algorithm and extensions , 1996 .
[12] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[13] Shun-ichi Amari,et al. Natural Gradient Works Efficiently in Learning , 1998, Neural Computation.
[14] Thorsten Joachims,et al. Making large scale SVM learning practical , 1998 .
[15] C. Fraley. On Computing the Largest Fraction of Missing Information for the EM Algorithm and the Worst Linear F , 1998 .
[16] Noboru Murata,et al. A Statistical Study on On-line Learning , 1999 .
[17] Stanley F. Chen,et al. A Gaussian Prior for Smoothing Maximum Entropy Models , 1999 .
[18] David Saad,et al. On-Line Learning in Neural Networks , 1999 .
[19] Shun-ichi Amari,et al. Statistical analysis of learning dynamics , 1999, Signal Process.
[20] Stephen J. Wright,et al. Numerical Optimization , 2018, Fundamental Statistical Inference.
[21] David A. Forsyth,et al. Shape, Contour and Grouping in Computer Vision , 1999, Lecture Notes in Computer Science.
[22] Yoshua Bengio,et al. Object Recognition with Gradient-Based Learning , 1999, Shape, Contour and Grouping in Computer Vision.
[23] B. Schölkopf,et al. Advances in kernel methods: support vector learning , 1999 .
[24] David M. Rocke,et al. Some computational issues in cluster analysis with no a priori metric , 1999 .
[25] David E. Booth,et al. Analysis of Incomplete Multivariate Data , 2000, Technometrics.
[26] Andrew McCallum,et al. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data , 2001, ICML.
[27] Rob Malouf,et al. A Comparison of Algorithms for Maximum Entropy Parameter Estimation , 2002, CoNLL.
[28] Nicol N. Schraudolph,et al. Conjugate Directions for Stochastic Gradient Descent , 2002, ICANN.
[29] James C. Spall,et al. Introduction to stochastic search and optimization - estimation, simulation, and control , 2003, Wiley-Interscience series in discrete mathematics and optimization.
[30] Yann LeCun,et al. Large Scale Online Learning , 2003, NIPS.
[31] Ruslan Salakhutdinov,et al. Adaptive Overrelaxed Bound Optimization Methods , 2003, ICML.
[32] Fernando Pereira,et al. Shallow Parsing with Conditional Random Fields , 2003, NAACL.
[33] Tim Hesterberg,et al. Introduction to Stochastic Search and Optimization: Estimation, Simulation, and Control , 2004, Technometrics.
[34] C. N Bouza,et al. Spall, J.C. Introduction to stochastic search and optimization. Estimation, simulation and control. Wiley Interscience Series in Discrete Mathematics and Optimization, 2003 , 2004 .
[35] Florentin Wörgötter,et al. Advances in Neural Information Processing Systems 16 (NIPS 2003) , 2004 .
[36] Burr Settles,et al. Biomedical Named Entity Recognition using Conditional Random Fields and Rich Feature Sets , 2004, NLPBA/BioNLP.
[37] Chun-Nan Hsu,et al. Triple jump acceleration for the EM algorithm , 2005, Fifth IEEE International Conference on Data Mining (ICDM'05).
[38] Ben Taskar,et al. Learning structured prediction models: a large margin approach , 2005, ICML.
[39] Léon Bottou,et al. On-line learning for very large data sets , 2005 .
[40] Yann LeCun,et al. The MNIST database of handwritten digits , 2005 .
[41] Thomas Hofmann,et al. Large Margin Methods for Structured and Interdependent Output Variables , 2005, J. Mach. Learn. Res.
[42] Thorsten Joachims,et al. Training linear SVMs in linear time , 2006, KDD '06.
[43] Chun-Nan Hsu,et al. Global and Componentwise Extrapolation for Accelerating Data Mining from Large Incomplete Data Sets with the EM Algorithm , 2006, Sixth International Conference on Data Mining (ICDM'06).
[44] Mark W. Schmidt,et al. Accelerated training of conditional random fields with stochastic gradient methods , 2006, ICML.
[45] Alexander J. Smola,et al. Step Size Adaptation in Reproducing Kernel Hilbert Space , 2006, J. Mach. Learn. Res.
[46] Jason Weston,et al. Solving multiclass support vector machines with LaRank , 2007, ICML '07.
[47] Léon Bottou,et al. The Tradeoffs of Large Scale Learning , 2007, NIPS.
[48] Jason Weston,et al. Large-scale kernel machines , 2007 .
[49] Nicolas Le Roux,et al. Topmoumoute Online Natural Gradient Algorithm , 2007, NIPS.
[50] James C. Spall,et al. Introduction to Stochastic Search and Optimization. Estimation, Simulation, and Control (Spall, J.C. , 2007 .
[51] Simon Günter,et al. A Stochastic Quasi-Newton Method for Online Convex Optimization , 2007, AISTATS.
[52] Peter L. Bartlett,et al. Exponentiated Gradient Algorithms for Conditional Random Fields and Max-Margin Markov Networks , 2008, J. Mach. Learn. Res.
[53] Chun-Nan Hsu,et al. Integrating high dimensional bi-directional parsing models for gene mention tagging , 2008, ISMB.
[54] Botond Cseke,et al. Advances in Neural Information Processing Systems 20 (NIPS 2007) , 2008 .
[55] Chun-Nan Hsu,et al. Global and componentwise extrapolations for accelerating training of Bayesian networks and conditional random fields , 2009, Data Mining and Knowledge Discovery.
[56] Chih-Jen Lin,et al. LIBSVM: A library for support vector machines , 2011, TIST.
[57] Yoram Singer,et al. Pegasos: primal estimated sub-gradient solver for SVM , 2011, Math. Program..