Competing with Automata-based Expert Sequences

We consider a general framework of online learning with expert advice where regret is defined with respect to sequences of experts accepted by a weighted automaton. Our framework covers several problems previously studied, including competing against k-shifting experts. We give a series of algorithms for this problem, including an automata-based algorithm extending weighted-majority and more efficient algorithms based on the notion of failure transitions. We further present efficient algorithms based on an approximation of the competitor automaton, in particular n-gram models obtained by minimizing the ∞-Rényi divergence, and present an extensive study of the approximation properties of such models. Finally, we also extend our algorithms and results to the framework of sleeping experts.
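To make the setting concrete, the following is a minimal sketch of the naive baseline that the automata-based algorithms improve upon: an exponentially weighted (weighted-majority style) forecaster that keeps one weight per admissible expert sequence, here the sequences with at most k shifts. It is not the paper's algorithm. The function names, the brute-force enumeration, the uniform initial weights, and the learning rate `eta` are all assumptions made for illustration, and the enumeration is exponential in the horizon, so it is only usable on toy inputs.

```python
import itertools
import math


def k_shifting_sequences(num_experts, horizon, k):
    """Enumerate expert sequences of length `horizon` with at most k shifts.

    Hypothetical helper: brute force, exponential in `horizon`.
    """
    for seq in itertools.product(range(num_experts), repeat=horizon):
        shifts = sum(1 for a, b in zip(seq, seq[1:]) if a != b)
        if shifts <= k:
            yield seq


def sequence_weighted_majority(losses, k, eta):
    """Run exponential weights over all k-shifting expert sequences.

    losses[t][i] is the loss in [0, 1] of expert i at round t.
    Returns (learner's cumulative expected loss, loss of the best sequence).
    """
    horizon = len(losses)
    num_experts = len(losses[0])
    paths = list(k_shifting_sequences(num_experts, horizon, k))
    weights = [1.0] * len(paths)  # assumption: uniform prior over sequences
    learner_loss = 0.0
    for t in range(horizon):
        total = sum(weights)
        # Marginal distribution over the expert chosen at round t.
        probs = [0.0] * num_experts
        for w, path in zip(weights, paths):
            probs[path[t]] += w / total
        learner_loss += sum(q * l for q, l in zip(probs, losses[t]))
        # Multiplicative update of each sequence by the loss of its round-t expert.
        weights = [
            w * math.exp(-eta * losses[t][path[t]])
            for w, path in zip(weights, paths)
        ]
    best = min(sum(losses[t][p[t]] for t in range(horizon)) for p in paths)
    return learner_loss, best


# Toy data: expert 0 is perfect for three rounds, then expert 1 is.
toy_losses = [[0.0, 1.0]] * 3 + [[1.0, 0.0]] * 3
learner, best = sequence_weighted_majority(toy_losses, k=1, eta=1.0)
regret = learner - best
```

On this toy input the best 1-shifting sequence incurs zero loss, and the learner's regret stays within the standard exponential-weights bound ln(M)/eta + eta*T/8, where M is the number of admissible sequences. Replacing the explicit enumeration with a compact representation of the admissible sequences is the kind of efficiency gain the abstract attributes to the automata-based algorithms.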
