Online Optimization with Memory and Competitive Control

This paper presents competitive algorithms for a novel class of online optimization problems with memory. We consider a setting where the learner seeks to minimize the sum of a hitting cost and a switching cost that depends on the previous $p$ decisions. This setting generalizes Smoothed Online Convex Optimization. The proposed approach, Optimistic Regularized Online Balanced Descent, achieves a constant, dimension-free competitive ratio. Further, we show a connection between online optimization with memory and online control with adversarial disturbances. This connection, in turn, leads to a new constant-competitive policy for a rich class of online control problems.
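As a sketch of the setting described above, the objective can be written as follows. The symbols $y_t$ (the decision at round $t$), $f_t$ (the hitting cost revealed at round $t$), $c$ (the switching cost), and $T$ (the horizon) are notation assumed here for illustration; only the memory length $p$ comes from the abstract itself.

```latex
% Assumed notation: y_t = decision, f_t = hitting cost, c = switching cost, T = horizon.
\min_{y_1, \dots, y_T} \; \sum_{t=1}^{T} \Big( f_t(y_t) + c\big(y_t, y_{t-1}, \dots, y_{t-p}\big) \Big)
```

Under this notation, Smoothed Online Convex Optimization would correspond to $p = 1$, with $c$ penalizing the distance between $y_t$ and $y_{t-1}$. An online algorithm is $C$-competitive, in the usual sense, if its total cost is at most $C$ times the cost of the offline optimum on every problem instance.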
