Exponentiated Gradient Versus Gradient Descent for Linear Predictors
[1] H. Johnson,et al. A comparison of 'traditional' and multimedia information systems development practices , 2003, Inf. Softw. Technol..
[2] D. Rubin,et al. Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper , 1977 .
[3] S. Thomas Alexander,et al. Adaptive Signal Processing , 1986, Texts and Monographs in Computer Science.
[4] N. Littlestone. Learning Quickly When Irrelevant Attributes Abound: A New Linear-Threshold Algorithm , 1987, 28th Annual Symposium on Foundations of Computer Science (sfcs 1987).
[5] Geoffrey E. Hinton. Learning distributed representations of concepts , 1989 .
[6] Nick Littlestone,et al. From on-line to batch learning , 1989, COLT '89.
[7] Vladimir Vovk,et al. Aggregating strategies , 1990, COLT '90.
[8] N. Littlestone. Mistake bounds and logarithmic linear-threshold learning algorithms , 1990 .
[9] Guy Jumarie,et al. Relative Information — What For? , 1990 .
[10] Nick Littlestone,et al. Redundant noisy attributes, attribute errors, and linear-threshold learning using winnow , 1991, COLT '91.
[11] Philip M. Long,et al. On-line learning of linear functions , 1991, STOC '91.
[12] Bernhard E. Boser,et al. A training algorithm for optimal margin classifiers , 1992, COLT '92.
[13] Manfred K. Warmuth,et al. Some weak learning results , 1992, COLT '92.
[14] R. Schapire. Toward Efficient Agnostic Learning , 1992 .
[15] J. N. Kapur,et al. Entropy optimization principles with applications , 1992 .
[16] Linda Sellie,et al. Toward efficient agnostic learning , 1992, COLT '92.
[17] David Haussler,et al. How to use expert advice , 1993, STOC.
[18] Manfred K. Warmuth,et al. Using experts for predicting continuous outcomes , 1994, European Conference on Computational Learning Theory.
[19] Philip M. Long,et al. Worst-case quadratic loss bounds for a generalization of the Widrow-Hoff rule , 1993, COLT '93.
[20] Philip M. Long,et al. Worst-case quadratic loss bounds for on-line prediction of linear functions by gradient descent , 1993 .
[21] S. Haykin,et al. Neural Networks: A Comprehensive Foundation , 1994 .
[22] Manfred K. Warmuth,et al. The Weighted Majority Algorithm , 1994, Inf. Comput..
[23] David Haussler,et al. Tight worst-case loss bounds for predicting with expert advice , 1994, EuroCOLT.
[24] Manfred K. Warmuth,et al. A comparison of new and old algorithms for a mixture estimation problem , 1995, COLT '95.
[25] Shun-ichi Amari,et al. Information geometry of the EM and em algorithms for neural networks , 1995, Neural Networks.
[26] Manfred K. Warmuth,et al. On Weak Learning , 1995, J. Comput. Syst. Sci..
[27] Shun-ichi Amari,et al. The EM Algorithm and Information Geometry in Neural Network Learning , 1995, Neural Computation.
[28] Steve Rogers,et al. Adaptive Filter Theory , 1996 .
[29] Yoram Singer,et al. On‐Line Portfolio Selection Using Multiplicative Updates , 1998, ICML.