Robust Quadratic Programming for MDPs with uncertain observation noise
Zhinan Peng | Hongliang Guo | Jianmei Su | Hong Cheng