TD Learning of Game Evaluation Functions with Hierarchical Neural Architectures
[1] Donato Malerba, et al. Decision Tree Pruning as a Search in the State Space, 1993, ECML.
[2] A. L. Samuel, et al. Some studies in machine learning using the game of checkers. II: recent progress, 1967.
[3] Frédéric Gruau, et al. Genetic synthesis of Boolean neural networks with a cell rewriting developmental process, 1992, [Proceedings] COGANN-92: International Workshop on Combinations of Genetic Algorithms and Neural Networks.
[4] Thomas G. Dietterich. What is machine learning?, 2020, Archives of Disease in Childhood.
[5] Frans C. A. Groen, et al. The Optimal Number of Learning Samples and Hidden Units in Function Approximation With a Feedforward, 1993.
[6] Arthur L. Samuel, et al. Some Studies in Machine Learning Using the Game of Checkers, 1967, IBM J. Res. Dev.
[7] Volker Tresp, et al. Network Structuring and Training Using Rule-Based Knowledge, 1992, NIPS.
[8] Geoffrey E. Hinton, et al. Adaptive Mixtures of Local Experts, 1991, Neural Computation.
[9] Robert A. Jacobs, et al. Hierarchical Mixtures of Experts and the EM Algorithm, 1993, Neural Computation.
[10] J Hakalay, et al. 'Partition of Unity' RBF Networks Are Universal Function Approximators, 2022.
[11] Long-Ji Lin, et al. Reinforcement learning for robots using neural networks, 1992.
[12] J. Stephen Judd, et al. Neural network design and the complexity of learning, 1990, Neural network modeling and connectionism.
[13] Steven J. Nowlan, et al. Soft competitive adaptation: neural network learning algorithms based on fitting statistical mixtures, 1991.
[14] Geoffrey E. Hinton, et al. Learning internal representations by error propagation, 1986.
[15] Sebastian Thrun, et al. Efficient Exploration In Reinforcement Learning, 1992.
[16] Alessandro Sperduti, et al. Speed up learning and network optimization with extended back propagation, 1993, Neural Networks.
[17] Sherif Hashem, et al. Optimal Linear Combinations of Neural Networks, 1997, Neural Networks.
[18] Patrick van der Smagt, et al. Introduction to neural networks, 1995, The Lancet.
[19] George Cybenko, et al. Approximation by superpositions of a sigmoidal function, 1992, Math. Control. Signals Syst.
[20] P. Dayan, et al. TD(λ) converges with probability 1, 2004, Machine Learning.
[21] M. Anthony. Uniform convergence and learnability, 1991.
[22] Hans J. Berliner, et al. Experiences in Evaluation with BKG - A Program that Plays Backgammon, 1977, IJCAI.
[23] Michael I. Jordan, et al. Hierarchies of Adaptive Experts, 1991, NIPS.
[24] George Cybenko, et al. Approximation by superpositions of a sigmoidal function, 1989, Math. Control. Signals Syst.
[25] P. Dayan. The Convergence of TD(λ) for General λ, 2004, Machine Learning.
[26] Dieter Fox, et al. Learning By Error-Driven Decomposition, 1991.
[27] Henk Corporaal, et al. Variations on the Cascade-Correlation Learning Architecture for Fast Convergence in Robot Control, 1992.
[28] J. D. Schaffer, et al. Combinations of genetic algorithms and neural networks: a survey of the state of the art, 1992, [Proceedings] COGANN-92: International Workshop on Combinations of Genetic Algorithms and Neural Networks.
[29] Terrence J. Sejnowski, et al. Temporal Difference Learning of Position Evaluation in the Game of Go, 1993, NIPS.
[30] G. Tesauro. Practical Issues in Temporal Difference Learning, 1992.
[31] Steven Douglas Whitehead, et al. Reinforcement learning for the adaptive control of perception and action, 1992.