Parallel Nonstationary Direct Policy Search for Risk-Averse Stochastic Optimization
Warren B. Powell | Boris Defourny | Somayeh Moazeni | Belgacem Bouzaïene-Ayari