Adversarial Tracking Control via Strongly Adaptive Online Learning with Memory

We consider the problem of tracking an adversarial state sequence in a linear dynamical system subject to adversarial disturbances and loss functions, generalizing earlier settings in the literature. To this end, we develop three techniques, each of independent interest. First, we propose a comparator-adaptive algorithm for online linear optimization with movement cost. Without any tuning, it nearly matches the performance of optimally tuned gradient descent in hindsight. Next, considering the related problem of online learning with memory, we construct a novel strongly adaptive algorithm that uses our first contribution as a building block. Finally, we present the first reduction from adversarial tracking control to strongly adaptive online learning with memory. Combining these techniques, we obtain an adversarial tracking controller with a strong performance guarantee, even when the reference trajectory has a large range of movement.
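
To make the first building block more concrete, the sketch below shows a standard comparator-adaptive (parameter-free) method for one-dimensional online linear optimization: a Krichevsky-Trofimov coin-betting bettor, evaluated on toy data together with a simple movement cost. This is a generic illustration under stated assumptions, not the paper's actual algorithm; the function name kt_coin_betting, the gradient bound g_t in [-1, 1], and the random toy losses are illustrative choices.

```python
import numpy as np

def kt_coin_betting(gradients, initial_wealth=1.0):
    """Krichevsky-Trofimov coin-betting bettor for 1-D unconstrained
    online linear optimization (a standard parameter-free scheme;
    a sketch, not the paper's algorithm).

    gradients: sequence of loss gradients g_t, assumed to lie in [-1, 1].
    Returns the list of predictions x_t.
    """
    wealth = initial_wealth      # W_0 = epsilon
    neg_grad_sum = 0.0           # running sum of -g_s for s < t
    predictions = []
    for t, g in enumerate(gradients, start=1):
        beta = neg_grad_sum / t  # KT betting fraction
        x = beta * wealth        # prediction x_t = beta_t * W_{t-1}
        predictions.append(x)
        wealth -= g * x          # wealth update W_t = W_{t-1} - g_t * x_t
        neg_grad_sum -= g
    return predictions

# Toy usage: linear losses <g_t, x_t> plus a movement cost |x_t - x_{t-1}|.
rng = np.random.default_rng(0)
g = rng.uniform(-1.0, 1.0, size=100)
xs = kt_coin_betting(g)
linear_loss = sum(gi * xi for gi, xi in zip(g, xs))
movement = sum(abs(a - b) for a, b in zip(xs[1:], xs[:-1]))
print(f"linear loss: {linear_loss:.3f}, movement cost: {movement:.3f}")
```

The point of such a scheme is that it needs no step-size tuning: its regret against any fixed comparator scales with the comparator's magnitude, which is the kind of comparator-adaptivity the abstract refers to.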
