Paper Information - Competing with Automata-based Expert Sequences

Competing with Automata-based Expert Sequences

We consider a general framework of online learning with expert advice where regret is defined with respect to sequences of experts accepted by a weighted automaton. Our framework covers several problems previously studied, including competing against k-shifting experts. We give a series of algorithms for this problem, including an automata-based algorithm extending weighted-majority and more efficient algorithms based on the notion of failure transitions. We further present efficient algorithms based on an approximation of the competitor automaton, in particular n-gram models obtained by minimizing the ∞-Rényi divergence, and present an extensive study of the approximation properties of such models. Finally, we also extend our algorithms and results to the framework of sleeping experts.

Mehryar Mohri | Scott Yang

[1] Wouter M. Koolen, et al. A Closer Look at Adaptive Regret, 2012, J. Mach. Learn. Res.

[2] Chen-Yu Wei, et al. Tracking the Best Expert in Non-stationary Stochastic Environments, 2017, NIPS.

[3] Nicolò Cesa-Bianchi, et al. Mirror Descent Meets Fixed Share (and feels no regret), 2012, NIPS.

[4] Wouter M. Koolen, et al. Universal Codes From Switching Strategies, 2013, IEEE Transactions on Information Theory.

[5] Shahin Shahrampour, et al. Distributed Online Optimization in Dynamic Environments Using Mirror Descent, 2016, IEEE Transactions on Automatic Control.

[6] Yoram Singer, et al. Using and combining predictors that specialize, 1997, STOC '97.

[7] Peter Auer, et al. The Nonstochastic Multiarmed Bandit Problem, 2002, SIAM J. Comput.

[8] Thomas Steinke, et al. Learning Hurdles for Sleeping Experts, 2014, ACM Trans. Comput. Theory.

[9] Tamás Linder, et al. Efficient Tracking of Large Classes of Experts, 2012, IEEE Trans. Inf. Theory.

[10] Gábor Lugosi, et al. Prediction, learning, and games, 2006.

[11] Vladimir Vovk, et al. Derandomizing Stochastic Prediction Strategies, 1997, COLT '97.

[12] András György, et al. Shifting Regret, Mirror Descent, and Matrices, 2016, ICML.

[13] Brian Roark, et al. Smoothed marginal distribution constraints for language modeling, 2013, ACL.

[14] Mehryar Mohri, et al. Weighted Automata Algorithms, 2009.

[15] Peter Harremoës, et al. Rényi Divergence and Kullback-Leibler Divergence, 2012, IEEE Transactions on Information Theory.

[16] Omar Besbes, et al. Non-Stationary Stochastic Optimization, 2013, Oper. Res.

[17] Mehryar Mohri, et al. A weight pushing algorithm for large vocabulary speech recognition, 2001, INTERSPEECH.

[18] Amit Daniely, et al. Strongly Adaptive Online Learning, 2015, ICML.

[19] Rebecca Willett, et al. Online Optimization in Dynamic Environments, 2013, ArXiv.

[20] Mark Herbster, et al. Tracking the Best Expert, 1995, Machine-mediated learning.

[21] Robert D. Kleinberg, et al. Regret bounds for sleeping experts and bandits, 2010, Machine Learning.

[22] Aryan Mokhtari, et al. Optimization in Dynamic Environments: Improved Regret Rates for Strongly Convex Problems, 2016.

[23] Mehryar Mohri, et al. On-Line Learning Algorithms for Path Experts with Non-Additive Losses, 2015, COLT.

[24] Tommi S. Jaakkola, et al. Online Learning of Non-stationary Sequences, 2003, NIPS.

[25] Seshadhri Comandur, et al. Efficient learning algorithms for changing environments, 2009, ICML '09.

[26] Manfred K. Warmuth, et al. Exponentiated Gradient Versus Gradient Descent for Linear Predictors, 1997, Inf. Comput.

[27] Omar Besbes, et al. Optimal Exploration-Exploitation in a Multi-Armed-Bandit Problem with Non-Stationary Rewards, 2014, Stochastic Systems.

[28] Varun Kanade, et al. Sleeping Experts and Bandits with Stochastic Action Availability and Adversarial Rewards, 2009, AISTATS.

[29] Manfred K. Warmuth, et al. Path Kernels and Multiplicative Updates, 2002, J. Mach. Learn. Res.

[30] Brian Roark, et al. Generalized Algorithms for Constructing Statistical Language Models, 2003, ACL.

[31] Manfred K. Warmuth, et al. The weighted majority algorithm, 1989, 30th Annual Symposium on Foundations of Computer Science.

[32] Shahin Shahrampour, et al. Online Optimization: Competing with Dynamic Comparators, 2015, AISTATS.

[33] Mehryar Mohri, et al. Finite-State Transducers in Language and Speech Processing, 1997, CL.