Robustness to Incorrect System Models in Stochastic Control

In stochastic control applications, the control design is typically based on an assumed ideal model (a controlled transition kernel), which raises the problem of performance loss due to the mismatch between the assumed model and the actual one. Motivated by this problem, we study continuity properties of discrete-time stochastic control problems with respect to the system model (i.e., the controlled transition kernel) and the robustness of optimal control policies that are designed for an incorrect model but applied to the true system. We consider both fully observed and partially observed setups under an infinite-horizon discounted expected cost criterion. We show that continuity and robustness cannot be established under weak or setwise convergence of transition kernels in general, but that the expected induced cost is robust under total variation convergence. By imposing further assumptions on the measurement models and on the kernels themselves (such as continuous convergence), we show that the optimal cost can also be made continuous under weak convergence of transition kernels. Using these continuity properties, we establish convergence results and error bounds for the mismatch that arises when a control policy designed for an incorrectly estimated system model is applied to the true model, thus obtaining both positive and negative results on robustness. Compared to the existing literature, we obtain strictly refined robustness results that apply even when the incorrect models converge to the true model only in the weak or setwise sense, in addition to the total variation sense. These results have positive implications for empirical (data-driven) learning in stochastic control, since system models are often learned from empirical training data, for which the weak convergence criterion typically holds but stronger convergence criteria do not.
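To fix ideas, the following is a minimal sketch of the continuity and robustness notions in standard discounted-cost notation; the symbols $J_\beta$, $c$, $\gamma$, and $\mathcal{T}$ are illustrative and not taken from the paper itself.

% Illustrative notation (an assumption, not the paper's own symbols).
% Discounted expected cost under transition kernel \mathcal{T} and admissible policy \gamma:
\[
  J_\beta(\mathcal{T},\gamma) \;=\; E^{\mathcal{T},\gamma}\!\left[\sum_{t=0}^{\infty} \beta^{t}\, c(x_t,u_t)\right],
  \qquad
  J_\beta^{*}(\mathcal{T}) \;=\; \inf_{\gamma} J_\beta(\mathcal{T},\gamma),
  \qquad 0<\beta<1 .
\]
% Continuity of the optimal cost: if \mathcal{T}_n \to \mathcal{T} in a given sense
% (weak, setwise, or total variation), then
\[
  J_\beta^{*}(\mathcal{T}_n) \;\longrightarrow\; J_\beta^{*}(\mathcal{T}) .
\]
% Robustness: applying a policy \gamma_n^{*} that is optimal for the (incorrect) model \mathcal{T}_n
% to the true model \mathcal{T} incurs a vanishing mismatch loss,
\[
  J_\beta(\mathcal{T},\gamma_n^{*}) \;-\; J_\beta^{*}(\mathcal{T}) \;\longrightarrow\; 0 .
\]

The negative results state that these limits can fail when $\mathcal{T}_n \to \mathcal{T}$ only weakly or setwise, while the positive results give conditions (total variation convergence, or weak convergence with additional regularity of the kernels and measurement models) under which they hold.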
