Boosting for Dynamical Systems

We propose a boosting framework for learning and control in environments that maintain a state. Leveraging methods for online learning with memory and for online boosting, we design an efficient online algorithm that provably improves the accuracy of weak learners in stateful environments. As a consequence, we obtain efficient boosting algorithms for both prediction and control of dynamical systems. Empirical evaluation on simulated and real data, for both control and prediction, supports our theoretical findings.
