Universal parameter optimisation in games based on SPSA
[1] J. Kiefer,et al. Stochastic Estimation of the Maximum of a Regression Function , 1952 .
[2] J. Blum. Multidimensional Stochastic Approximation Methods , 1954 .
[3] V. Fabian. On Asymptotic Normality in Stochastic Approximation , 1968 .
[4] Sid Sackson,et al. A Gamut of Games , 1969 .
[5] Selim G. Akl,et al. The principal continuation and the killer heuristic , 1977, ACM '77.
[6] T. Anthony Marsland,et al. Parallel Search of Strongly Ordered Game Trees , 1982, CSUR.
[7] R. Rubinstein,et al. Antithetic Variates, Multivariate Dependence and Simulation of Stochastic Systems , 1985 .
[8] Hung Chen. Lower Rate of Convergence for Locating a Maximum of a Function , 1988 .
[10] P. Glasserman,et al. Some Guidelines and Guarantees for Common Random Numbers , 1992 .
[11] J. Spall. Multivariate stochastic approximation using a simultaneous perturbation gradient approximation , 1992 .
[12] Martin A. Riedmiller,et al. A direct adaptive method for faster backpropagation learning: the RPROP algorithm , 1993, IEEE International Conference on Neural Networks.
[13] Jonathan Schaeffer,et al. New advances in Alpha-Beta searching , 1996, CSC '96.
[14] H. Jaap van den Herik,et al. Replacement Schemes and Two-Level Tables , 1996, J. Int. Comput. Games Assoc..
[15] Harold J. Kushner,et al. Stochastic Approximation Algorithms and Applications , 1997, Applications of Mathematics.
[16] J. Spall,et al. Optimal random perturbations for stochastic approximation using a simultaneous perturbation gradient approximation , 1997, Proceedings of the 1997 American Control Conference (Cat. No.97CH36041).
[17] Gang George Yin,et al. Budget-Dependent Convergence Rate of Stochastic Approximation , 1995, SIAM J. Optim..
[18] Sigrún Andradóttir,et al. A review of simulation optimization techniques , 1998, 1998 Winter Simulation Conference. Proceedings (Cat. No.98CH36274).
[19] Andrew W. Moore,et al. Gradient Descent for General Reinforcement Learning , 1998, NIPS.
[20] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .
[21] David B. Fogel,et al. Evolving neural networks to play checkers without relying on expert knowledge , 1999, IEEE Trans. Neural Networks.
[22] Ernst A. Heinz. Adaptive Null-Move Pruning , 1999, J. Int. Comput. Games Assoc..
[23] Nicol N. Schraudolph,et al. Local Gain Adaptation in Stochastic Gradient Descent , 1999 .
[24] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[25] J. Spall,et al. Simulation-Based Optimization with Stochastic Approximation Using Common Random Numbers , 1999 .
[26] László Gerencsér,et al. Optimization over discrete sets via SPSA , 1999, WSC '99.
[27] László Gerencsér,et al. Non-smooth optimization via SPSA , 1999 .
[28] James C. Spall,et al. Adaptive stochastic approximation by the simultaneous perturbation method , 2000, IEEE Trans. Autom. Control..
[29] M. Winands. Informed Search in Complex Games , 2000 .
[30] Christian Igel,et al. Improving the Rprop Learning Algorithm , 2000 .
[31] Peter L. Bartlett,et al. Infinite-Horizon Policy-Gradient Estimation , 2001, J. Artif. Intell. Res..
[32] Jos W. H. M. Uiterwijk,et al. Temporal Difference Learning and the Neural MoveMap Heuristic in the Game of Lines of Action , 2002 .
[33] Manuela Veloso,et al. Scalable Learning in Stochastic Games , 2002 .
[34] Jonathan Schaeffer,et al. The challenge of poker , 2002, Artif. Intell..
[35] John Langford,et al. Approximately Optimal Approximate Reinforcement Learning , 2002, ICML.
[36] Takashi Chikayama,et al. Game-tree Search Algorithm based on Realization Probability , 2002, J. Int. Comput. Games Assoc..
[37] Michael C. Fu,et al. Randomized-direction stochastic approximation algorithms using deterministic sequences , 2002, Proceedings of the Winter Simulation Conference.
[38] Nicol N. Schraudolph,et al. Towards stochastic conjugate gradient methods , 2002, Proceedings of the 9th International Conference on Neural Information Processing, 2002. ICONIP '02..
[39] Nicol N. Schraudolph,et al. Conjugate Directions for Stochastic Gradient Descent , 2002, ICANN.
[40] James C. Spall,et al. Introduction to stochastic search and optimization - estimation, simulation, and control , 2003, Wiley-Interscience series in discrete mathematics and optimization.
[41] T. Anthony Marsland,et al. Learning extension parameters in game-tree search , 2003, Inf. Sci..
[42] Michael C. Fu,et al. Convergence of simultaneous perturbation stochastic approximation for nondifferentiable optimization , 2003, IEEE Trans. Autom. Control..
[43] Christian Igel,et al. Empirical evaluation of the improved Rprop learning algorithms , 2003, Neurocomputing.
[44] H. Jaap van den Herik,et al. Two Learning Algorithms for Forward Pruning , 2003, J. Int. Comput. Games Assoc..
[45] Jonathan Schaeffer,et al. Approximating Game-Theoretic Optimal Strategies for Full-scale Poker , 2003, IJCAI.
[46] H. Jaap van den Herik,et al. An Evaluation Function for Lines of Action , 2003, ACG.
[47] Levente Kocsis. Learning search decisions , 2003 .
[48] Accelerated randomized stochastic optimization , 2003 .
[49] Peter L. Bartlett,et al. Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning , 2001, J. Mach. Learn. Res..
[50] Tim Hesterberg,et al. Introduction to Stochastic Search and Optimization: Estimation, Simulation, and Control , 2004, Technometrics.
[51] H. Jaap van den Herik,et al. The Relative History Heuristic , 2004, Computers and Games.
[52] Ronald J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 2004, Machine Learning.
[53] Gerald Tesauro,et al. Practical issues in temporal difference learning , 1992, Machine Learning.
[54] Jonathan Schaeffer,et al. Game-Tree Search with Adaptation in Stochastic Imperfect-Information Games , 2004, Computers and Games.
[55] Andrew Tridgell,et al. Learning to Play Chess Using Temporal Differences , 2000, Machine Learning.
[56] Csaba Szepesvari,et al. Reduced-Variance Payoff Estimation in Adversarial Bandit Problems , 2005 .
[57] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[58] George D. Magoulas,et al. New globally convergent training scheme based on the resilient propagation algorithm , 2005, Neurocomputing.
[59] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.
[60] Eric Moulines,et al. Comparison of resampling schemes for particle filtering , 2005, ISPA 2005. Proceedings of the 4th International Symposium on Image and Signal Processing and Analysis, 2005..
[61] James Theiler,et al. On the choice of random directions for stochastic approximation algorithms , 2006, IEEE Transactions on Automatic Control.
[62] Csaba Szepesvári,et al. RSPSA: Enhanced Parameter Optimization in Games , 2006, ACG.
[63] H. Robbins. A Stochastic Approximation Method , 1951 .
[64] James C. Spall,et al. Introduction to Stochastic Search and Optimization. Estimation, Simulation, and Control (Spall, J.C.) , 2007 .