Learning to predict by the methods of temporal differences
[1] Arthur L. Samuel, et al. Some Studies in Machine Learning Using the Game of Checkers, 1967, IBM J. Res. Dev.
[2] J. Gillis, et al. Matrix Iterative Analysis, 1961.
[3] E. Feigenbaum, et al. Computers and Thought, 1963.
[4] John G. Kemeny, et al. Finite Markov chains, 1960.
[5] Ian H. Witten, et al. An Adaptive Optimal Controller for Discrete-Time Markov Environments, 1977, Inf. Control.
[6] A G Barto, et al. Toward a modern theory of adaptive networks: expectation and prediction, 1981, Psychological review.
[7] Lashon B. Booker, et al. Intelligent Behavior as an Adaptation to the Task Environment, 1982.
[8] Richard S. Sutton, et al. Neuronlike adaptive elements that can solve difficult learning control problems, 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[9] Steven Edward Hampson, et al. A neural model of adaptive behavior, 1983.
[10] Richard S. Sutton, et al. Temporal credit assignment in reinforcement learning, 1984.
[11] Thomas G. Dietterich, et al. Learning to Predict Sequences, 1985.
[12] J. Hopfield, et al. The Logic of Limax Learning, 1985.
[13] A G Barto, et al. Learning by statistical cooperation of self-interested neuron-like computing elements, 1985, Human neurobiology.
[14] Geoffrey E. Hinton, et al. A Learning Algorithm for Boltzmann Machines, 1985, Cogn. Sci.
[15] J. Christensen. Learning static evaluation functions by linear regression, 1986.
[16] Geoffrey E. Hinton, et al. Learning internal representations by error propagation, 1986.
[17] Charles W. Anderson, et al. Learning and problem-solving with multilayer connectionist systems (adaptive, strategy learning, neural networks, reinforcement learning), 1986.
[18] John H. Holland, et al. Escaping brittleness: the possibilities of general-purpose learning algorithms applied to parallel rule-based systems, 1995.
[19] Richard E. Korf, et al. A Unified Theory of Heuristic Evaluation Functions and its Application to Learning, 1986, AAAI.
[20] S. Thomas Alexander, et al. Adaptive Signal Processing, 1986, Texts and Monographs in Computer Science.
[21] R. Sutton, et al. Simulation of the classically conditioned nictitating membrane response by a neuron-like adaptive element: Response topography, neuronal firing, and interstimulus intervals, 1986, Behavioural Brain Research.
[22] Jaime G. Carbonell, et al. Machine learning: a guide to current research, 1986.
[23] Charles W. Anderson, et al. Strategy Learning with Multilayer Connectionist Representations, 1987.
[24] E. Kehoe, et al. Temporal primacy overrides prior training in serial compound conditioning of the rabbit’s nictitating membrane response, 1987.
[25] Bart W. Stuck, et al. A Computer and Communication Network Performance Analysis Primer (Prentice Hall, Englewood Cliffs, NJ, 1985; revised, 1987), 1987, Int. CMG Conference.
[26] A. Klopf. A neuronal model of classical conditioning, 1988.
[27] Arthur L. Samuel, et al. Some studies in machine learning using the game of checkers, 2000, IBM J. Res. Dev.
[28] Eric V. Denardo, et al. Dynamic Programming: Models and Applications, 2003.
[29] S. Hampson, et al. Disjunctive models of Boolean category learning, 1987, Biological Cybernetics.