[1] Christian Igel,et al. Hoeffding and Bernstein races for selecting policies in evolutionary direct policy search , 2009, ICML '09.
[2] Alexander J. Smola,et al. Exponential Regret Bounds for Gaussian Process Bandits with Deterministic Observations , 2012, ICML.
[3] Anne Auger,et al. Mirrored Sampling and Sequential Selection for Evolution Strategies , 2010, PPSN.
[4] John L. Nazareth,et al. Introduction to derivative-free optimization , 2010, Math. Comput..
[5] Yi-Chi Wang,et al. Application of reinforcement learning for agent-based production scheduling , 2005, Eng. Appl. Artif. Intell..
[6] Jürgen Schmidhuber,et al. Evolving large-scale neural networks for vision-based reinforcement learning , 2013, GECCO '13.
[7] Martin J. Wainwright,et al. Optimal Rates for Zero-Order Convex Optimization: The Power of Two Function Evaluations , 2013, IEEE Transactions on Information Theory.
[8] Kenneth O. Stanley,et al. Safe mutations for deep and recurrent neural networks through output gradients , 2017, GECCO.
[9] Luís Paulo Reis,et al. Model-Based Relative Entropy Stochastic Search , 2016, NIPS.
[10] Marius Lindauer,et al. An Evolution Strategy with Progressive Episode Lengths for Playing Games , 2019, IJCAI.
[11] Frank Hutter,et al. Back to Basics: Benchmarking Canonical Evolution Strategies for Playing Atari , 2018, IJCAI.
[12] Kirthevasan Kandasamy,et al. High Dimensional Bayesian Optimisation and Bandits via Additive Models , 2015, ICML.
[13] C. Karen Liu,et al. Policy Transfer with Strategy Optimization , 2018, ICLR.
[14] Max Jaderberg,et al. Population Based Training of Neural Networks , 2017, ArXiv.
[15] Kenneth O. Stanley,et al. ES is more than just a traditional finite-difference approximator , 2017, GECCO.
[16] Marcin Andrychowicz,et al. Parameter Space Noise for Exploration , 2017, ICLR.
[17] Xin Yao,et al. Turning High-Dimensional Optimization Into Computationally Expensive Optimization , 2016, IEEE Transactions on Evolutionary Computation.
[18] Stefan Schaal,et al. Natural Actor-Critic , 2003, Neurocomputing.
[19] X. Yao. Evolving Artificial Neural Networks , 1999 .
[20] Shimon Whiteson,et al. Evolutionary Function Approximation for Reinforcement Learning , 2006, J. Mach. Learn. Res..
[21] James E. Baker,et al. Reducing Bias and Inefficiency in the Selection Algorithm , 1987, ICGA.
[22] Bernard Ghanem,et al. A Stochastic Derivative-Free Optimization Method with Importance Sampling: Theory and Learning to Control , 2019, AAAI.
[23] Geoffrey J. Gordon,et al. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning , 2010, AISTATS.
[24] K. Doya,et al. Representation of Action-Specific Reward Values in the Striatum , 2005, Science.
[25] Alan Fern,et al. Using trajectory data to improve bayesian optimization for reinforcement learning , 2014, J. Mach. Learn. Res..
[26] Yuren Zhou,et al. A Restart-based Rank-1 Evolution Strategy for Reinforcement Learning , 2019, IJCAI.
[27] Kagan Tumer,et al. Evolution-Guided Policy Gradient in Reinforcement Learning , 2018, NeurIPS.
[28] Pedro M. Domingos. A few useful things to know about machine learning , 2012, Commun. ACM.
[29] J. Andrew Bagnell,et al. Contrasting Exploration in Parameter and Action Space: A Zeroth-Order Optimization Perspective , 2019, AISTATS.
[30] Tobias Glasmachers,et al. Challenges in High-dimensional Reinforcement Learning with Evolution Strategies , 2018, PPSN.
[31] Simon Lucey,et al. Learning Policies for Adaptive Tracking with Deep Feature Cascades , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[32] Xin Yao,et al. Drift analysis and average time complexity of evolutionary algorithms , 2001, Artif. Intell..
[33] Andreas Krause,et al. Virtual vs. real: Trading off simulations and physical experiments in reinforcement learning with Bayesian optimization , 2017, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[34] Risto Miikkulainen,et al. Designing neural networks through neuroevolution , 2019, Nat. Mach. Intell..
[35] András Lörincz,et al. Learning Tetris Using the Noisy Cross-Entropy Method , 2006, Neural Computation.
[36] Anne Auger,et al. Evolution Strategies , 2018, Handbook of Computational Intelligence.
[37] Lantao Yu,et al. SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient , 2016, AAAI.
[38] Guy Lever,et al. Human-level performance in 3D multiplayer games with population-based reinforcement learning , 2018, Science.
[39] Quoc V. Le,et al. Neural Architecture Search with Reinforcement Learning , 2016, ICLR.
[40] Qingfu Zhang,et al. Fast Covariance Matrix Adaptation for Large-Scale Black-Box Optimization , 2020, IEEE Transactions on Cybernetics.
[41] Eytan Bakshy,et al. Bayesian Optimization for Policy Search via Online-Offline Experimentation , 2019, J. Mach. Learn. Res..
[42] Stefan Schaal,et al. Natural Actor-Critic , 2003, Neurocomputing.
[43] Marc G. Bellemare,et al. The Arcade Learning Environment: An Evaluation Platform for General Agents , 2012, J. Artif. Intell. Res..
[44] Yang Yu,et al. Derivative-Free Optimization of High-Dimensional Non-Convex Functions by Sequential Random Embeddings , 2016, IJCAI.
[45] Nando de Freitas,et al. Taking the Human Out of the Loop: A Review of Bayesian Optimization , 2016, Proceedings of the IEEE.
[46] Jan Peters,et al. An experimental comparison of Bayesian optimization for bipedal locomotion , 2014, 2014 IEEE International Conference on Robotics and Automation (ICRA).
[47] Yuval Tassa,et al. MuJoCo: A physics engine for model-based control , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[48] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[49] Tom Schaul,et al. Natural Evolution Strategies , 2008, 2008 IEEE Congress on Evolutionary Computation (IEEE World Congress on Computational Intelligence).
[50] Michael I. Jordan,et al. Ray: A Distributed Framework for Emerging AI Applications , 2017, OSDI.
[51] Yuting Zhang,et al. Improving object detection with deep convolutional networks via Bayesian optimization and structured prediction , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[52] Bo An,et al. An extended study on multi-objective security games , 2012, Autonomous Agents and Multi-Agent Systems.
[53] Alex Graves,et al. Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.
[54] Ohad Shamir,et al. Failures of Gradient-Based Deep Learning , 2017, ICML.
[55] Aimin Zhou,et al. Fuzzy-Classification Assisted Solution Preselection in Evolutionary Optimization , 2019, AAAI.
[56] Katya Scheinberg,et al. Introduction to derivative-free optimization , 2010, Math. Comput..
[57] Pedro Larrañaga,et al. Towards a New Evolutionary Computation - Advances in the Estimation of Distribution Algorithms , 2006, Towards a New Evolutionary Computation.
[58] Yang Yu,et al. On Subset Selection with General Cost Constraints , 2017, IJCAI.
[59] Kenneth O. Stanley,et al. Abandoning Objectives: Evolution Through the Search for Novelty Alone , 2011, Evolutionary Computation.
[60] Peter Vrancx,et al. Reinforcement Learning: State-of-the-Art , 2012 .
[61] Krzysztof Choromanski,et al. Variance Reduction for Evolution Strategies via Structured Control Variates , 2019, AISTATS.
[62] Kenneth O. Stanley,et al. Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents , 2017, NeurIPS.
[63] Pietro Lio',et al. Proximal Distilled Evolutionary Reinforcement Learning , 2019, AAAI.
[64] Wojciech Zaremba,et al. Improved Techniques for Training GANs , 2016, NIPS.
[65] Thomas G. Dietterich. Machine-Learning Research Four Current Directions , 1997 .
[66] Petros Koumoutsakos,et al. Reducing the Time Complexity of the Derandomized Evolution Strategy with Covariance Matrix Adaptation (CMA-ES) , 2003, Evolutionary Computation.
[67] Stewart W. Wilson,et al. Learning Classifier Systems: A Survey , 2022 .
[68] Yu Maruyama,et al. Global Continuous Optimization with Error Bound and Fast Convergence , 2016, J. Artif. Intell. Res..
[69] Rémi Munos,et al. From Bandits to Monte-Carlo Tree Search: The Optimistic Principle Applied to Optimization and Planning , 2014, Found. Trends Mach. Learn..
[70] Adam D. Bull,et al. Convergence Rates of Efficient Global Optimization Algorithms , 2011, J. Mach. Learn. Res..
[71] Yang Yu,et al. The sampling-and-learning framework: A statistical view of evolutionary algorithms , 2014, 2014 IEEE Congress on Evolutionary Computation (CEC).
[72] Yang Yu,et al. Switch Analysis for Running Time Analysis of Evolutionary Algorithms , 2015, IEEE Transactions on Evolutionary Computation.
[73] Jürgen Schmidhuber,et al. Recurrent World Models Facilitate Policy Evolution , 2018, NeurIPS.
[74] Shimon Whiteson,et al. Comparing evolutionary and temporal difference methods in a reinforcement learning domain , 2006, GECCO.
[75] R. Bellman. A Markovian Decision Process , 1957 .
[76] Kenji Doya,et al. Online meta-learning by parallel algorithm competition , 2018, GECCO.
[77] Richard E. Turner,et al. Structured Evolution with Compact Architectures for Scalable Policy Optimization , 2018, ICML.
[78] Marco Wiering,et al. Reinforcement Learning , 2014, Adaptation, Learning, and Optimization.
[79] Yang Yu,et al. Towards Sample Efficient Reinforcement Learning , 2018, IJCAI.
[80] Marc Toussaint,et al. Bayesian Functional Optimization , 2018, AAAI.
[81] Michael L. Littman,et al. Packet Routing in Dynamically Changing Networks: A Reinforcement Learning Approach , 1993, NIPS.
[82] Michael J. Frank,et al. By Carrot or by Stick: Cognitive Reinforcement Learning in Parkinsonism , 2004, Science.
[83] Adam Gaier,et al. Weight Agnostic Neural Networks , 2019, NeurIPS.
[84] Sergey Levine,et al. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks , 2017, ICML.
[85] Jian Peng,et al. Policy Optimization by Genetic Distillation , 2017, ICLR.
[86] Risto Miikkulainen,et al. Evolving neural networks for strategic decision-making problems , 2009, Neural Networks.
[87] Herke van Hoof,et al. Addressing Function Approximation Error in Actor-Critic Methods , 2018, ICML.
[88] Kenneth O. Stanley,et al. Deep Neuroevolution: Genetic Algorithms Are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning , 2017, ArXiv.
[89] Alok Aggarwal,et al. Regularized Evolution for Image Classifier Architecture Search , 2018, AAAI.
[90] Nikolaus Hansen,et al. Completely Derandomized Self-Adaptation in Evolution Strategies , 2001, Evolutionary Computation.
[91] Demis Hassabis,et al. A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play , 2018, Science.
[92] Yang Yu,et al. A new approach to estimating the expected first hitting time of evolutionary algorithms , 2006, Artif. Intell..
[93] Vianney Perchet,et al. Highly-Smooth Zero-th Order Online Optimization , 2016, COLT.
[94] Trevor Darrell,et al. Gradient-free Policy Architecture Search and Adaptation , 2017, CoRL.
[95] Kenneth O. Stanley,et al. A Case Study on the Critical Role of Geometric Regularity in Machine Learning , 2008, AAAI.
[96] Tamara G. Kolda,et al. Optimization by Direct Search: New Perspectives on Some Classical and Modern Methods , 2003, SIAM Rev..
[97] Dorothea Heiss-Czedik,et al. An Introduction to Genetic Algorithms. , 1997, Artificial Life.
[98] Kevin Leyton-Brown,et al. Auto-WEKA: combined selection and hyperparameter optimization of classification algorithms , 2012, KDD.
[99] Yang Yu,et al. Subset Selection by Pareto Optimization , 2015, NIPS.
[100] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[101] Christian Igel,et al. Evolution Strategies for Direct Policy Search , 2008, PPSN.
[102] D. R. McGregor,et al. Designing application-specific neural networks using the structured genetic algorithm , 1992, [Proceedings] COGANN-92: International Workshop on Combinations of Genetic Algorithms and Neural Networks.
[103] Kenneth O. Stanley,et al. On the Relationship Between the OpenAI Evolution Strategy and Stochastic Gradient Descent , 2017, ArXiv.
[104] Pieter Abbeel,et al. An Application of Reinforcement Learning to Aerobatic Helicopter Flight , 2006, NIPS.
[105] Yang Yu,et al. Derivative-Free Optimization via Classification , 2016, AAAI.
[106] Adel Bibi,et al. A Stochastic Derivative Free Optimization Method with Momentum , 2019, ICLR.
[107] Xi Chen,et al. Evolution Strategies as a Scalable Alternative to Reinforcement Learning , 2017, ArXiv.
[108] Jan Peters,et al. Bayesian optimization for learning gaits under uncertainty , 2015, Annals of Mathematics and Artificial Intelligence.
[109] Viet-Hung Dang,et al. A Covariance Matrix Adaptation Evolution Strategy for Direct Policy Search in Reproducing Kernel Hilbert Space , 2017, ACML.
[110] Andreas Krause,et al. Efficient High Dimensional Bayesian Optimization with Additivity and Quadrature Fourier Features , 2018, NeurIPS.
[111] Krzysztof Choromanski,et al. From Complexity to Simplicity: Adaptive ES-Active Subspaces for Blackbox Optimization , 2019, NeurIPS.
[112] Nando de Freitas,et al. A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning , 2010, ArXiv.
[113] Nando de Freitas,et al. Bayesian Optimization in a Billion Dimensions via Random Embeddings , 2013, J. Artif. Intell. Res..
[114] Andreas Krause,et al. Information-Theoretic Regret Bounds for Gaussian Process Optimization in the Bandit Setting , 2009, IEEE Transactions on Information Theory.
[115] Risto Miikkulainen,et al. Efficient Reinforcement Learning Through Evolving Neural Network Topologies , 2002, GECCO.
[116] Youngchul Sung,et al. Population-Guided Parallel Policy Search for Reinforcement Learning , 2020, ICLR.
[117] Martin J. Wainwright,et al. Derivative-Free Methods for Policy Optimization: Guarantees for Linear Quadratic Systems , 2018, AISTATS.
[118] Olivier Sigaud,et al. Importance mixing: Improving sample reuse in evolutionary policy search methods , 2018, ArXiv.
[119] Pieter Abbeel,et al. Evolved Policy Gradients , 2018, NeurIPS.
[120] Kenneth O. Stanley,et al. Simple Evolutionary Optimization Can Rival Stochastic Gradient Descent in Neural Networks , 2016, GECCO.
[121] J. Geweke,et al. Antithetic acceleration of Monte Carlo integration in Bayesian inference , 1988 .
[122] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[123] Jasper Snoek,et al. Practical Bayesian Optimization of Machine Learning Algorithms , 2012, NIPS.
[124] Peter L. Bartlett,et al. RL$^2$: Fast Reinforcement Learning via Slow Reinforcement Learning , 2016, ArXiv.
[125] Richard S. Sutton,et al. Reinforcement Learning with Replacing Eligibility Traces , 2005, Machine Learning.
[126] Jan Peters,et al. Bayesian optimization for learning gaits under uncertainty , 2015, Annals of Mathematics and Artificial Intelligence.
[127] Paul G. Constantine,et al. Active Subspaces - Emerging Ideas for Dimension Reduction in Parameter Studies , 2015, SIAM spotlights.
[128] Quoc V. Le,et al. Large-Scale Evolution of Image Classifiers , 2017, ICML.
[129] Nikolaus Hansen,et al. Adapting arbitrary normal mutation distributions in evolution strategies: the covariance matrix adaptation , 1996, Proceedings of IEEE International Conference on Evolutionary Computation.
[130] Xin Yao,et al. Turning High-Dimensional Optimization Into Computationally Expensive Optimization , 2018, IEEE Transactions on Evolutionary Computation.
[131] Hong Wang,et al. Noisy Derivative-Free Optimization With Value Suppression , 2018, AAAI.
[132] Yang Yu,et al. Sequential Classification-Based Optimization for Direct Policy Search , 2017, AAAI.
[133] Shimon Whiteson,et al. Evolutionary Computation for Reinforcement Learning , 2012, Reinforcement Learning.
[134] Risto Miikkulainen,et al. Evolving Neural Networks through Augmenting Topologies , 2002, Evolutionary Computation.
[135] Verena Heidrich-Meisner,et al. Neuroevolution strategies for episodic reinforcement learning , 2009, J. Algorithms.
[136] Shie Mannor,et al. A Tutorial on the Cross-Entropy Method , 2005, Ann. Oper. Res..
[137] Julian Togelius,et al. Neuroevolution in Games: State of the Art and Open Challenges , 2014, IEEE Transactions on Computational Intelligence and AI in Games.
[138] Olivier Sigaud,et al. Path Integral Policy Improvement with Covariance Matrix Adaptation , 2012, ICML.
[139] Peter L. Bartlett,et al. Infinite-Horizon Policy-Gradient Estimation , 2001, J. Artif. Intell. Res..
[140] Thomas Bartz-Beielstein,et al. Surrogate models for enhancing the efficiency of neuroevolution in reinforcement learning , 2019, GECCO.
[141] Shimon Whiteson,et al. Sample-Efficient Evolutionary Function Approximation for Reinforcement Learning , 2006, AAAI.
[142] Yang Yu,et al. Virtual-Taobao: Virtualizing Real-world Online Retail Environment for Reinforcement Learning , 2018, AAAI.
[143] Harold J. Kushner,et al. A New Method of Locating the Maximum Point of an Arbitrary Multipeak Curve in the Presence of Noise , 1964 .
[144] Nikolaos V. Sahinidis,et al. Derivative-free optimization: a review of algorithms and comparison of software implementations , 2013, J. Glob. Optim..
[145] Pieter Abbeel,et al. Benchmarking Deep Reinforcement Learning for Continuous Control , 2016, ICML.
[146] Yang Yu,et al. Scaling Simultaneous Optimistic Optimization for High-Dimensional Non-Convex Functions with Low Effective Dimensions , 2016, AAAI.
[147] John J. Grefenstette,et al. Evolutionary Algorithms for Reinforcement Learning , 1999, J. Artif. Intell. Res..
[148] Yang Yu,et al. Reinforcement Learning with Derivative-Free Exploration , 2019, AAMAS.
[149] Nenghai Yu,et al. Trust Region Evolution Strategies , 2019, AAAI.
[150] Xin Yao,et al. Evolving artificial neural networks , 1999, Proc. IEEE.
[151] Pierre-Yves Oudeyer,et al. GEP-PG: Decoupling Exploration and Exploitation in Deep Reinforcement Learning Algorithms , 2017, ICML.
[152] Sergey Levine,et al. Trust Region Policy Optimization , 2015, ICML.
[153] Jasper Snoek,et al. Multi-Task Bayesian Optimization , 2013, NIPS.
[154] John C. Duchi,et al. Derivative Free Optimization Via Repeated Classification , 2018, AISTATS.
[155] Matthias Poloczek,et al. Scalable Global Optimization via Local Bayesian Optimization , 2019, NeurIPS.
[156] Risto Miikkulainen,et al. A Neuroevolution Approach to General Atari Game Playing , 2014, IEEE Transactions on Computational Intelligence and AI in Games.
[157] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[158] Carl E. Rasmussen,et al. Gaussian processes for machine learning , 2005, Adaptive computation and machine learning.
[159] Alec Radford,et al. Proximal Policy Optimization Algorithms , 2017, ArXiv.
[160] Brigitte C. Madrian,et al. Reinforcement Learning and Savings Behavior , 2007, The Journal of finance.
[161] Risto Miikkulainen,et al. HyperNEAT-GGP: a hyperNEAT-based atari general game player , 2012, GECCO '12.
[162] Leslie Pack Kaelbling,et al. Bayesian Optimization with Exponential Convergence , 2015, NIPS.
[163] Benjamin Recht,et al. Simple random search of static linear policies is competitive for reinforcement learning , 2018, NeurIPS.
[164] John H. Holland,et al. Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence , 1992 .
[165] Alexei A. Efros,et al. Curiosity-Driven Exploration by Self-Supervised Prediction , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).