Incremental acquisition of behaviors and signs based on a reinforcement learning schemata model and a spike timing-dependent plasticity network
[1] Saori C. Tanaka,et al. Prediction of immediate and future rewards differentially recruits cortico-basal ganglia loops , 2004, Nature Neuroscience.
[2] Peter Dayan,et al. A Neural Substrate of Prediction and Reward , 1997, Science.
[3] Stefano Nolfi,et al. Learning to perceive the world as articulated: an approach for hierarchical learning in sensory-motor systems , 1998, Neural Networks.
[4] T. Taniguchi,et al. Symbol emergence by combining a reinforcement learning schema model with asymmetric synaptic plasticity , 2006 .
[5] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[6] L. Abbott,et al. Cortical Development and Remapping through Spike Timing-Dependent Plasticity , 2001, Neuron.
[7] Minoru Asada,et al. Cognitive developmental robotics as a new paradigm for the design of humanoid robots , 2001, Robotics Auton. Syst.
[8] T. Sawaragi,et al. Design and performance of symbols self-organized within an autonomous agent interacting with varied environments , 2004, RO-MAN 2004. 13th IEEE International Workshop on Robot and Human Interactive Communication (IEEE Catalog No.04TH8759).
[9] Y. Takahashi,et al. Lexicon Acquisition based on Behavior Learning , 2005, Proceedings. The 4th International Conference on Development and Learning, 2005.
[10] E. Capaldi,et al. The organization of behavior. , 1992, Journal of applied behavior analysis.
[11] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell..
[12] Stuart J. Russell,et al. Bayesian Q-Learning , 1998, AAAI/IAAI.
[13] H. Markram,et al. Regulation of Synaptic Efficacy by Coincidence of Postsynaptic APs and EPSPs , 1997, Science.
[14] Michael Davis,et al. The role of the amygdala in fear and anxiety. , 1992, Annual review of neuroscience.
[15] Vishal Soni,et al. Reinforcement learning of hierarchical skills on the Sony Aibo robot , 2005, AAAI 2005.
[16] L. Abbott,et al. Competitive Hebbian learning through spike-timing-dependent synaptic plasticity , 2000, Nature Neuroscience.
[17] Satinder Singh. Transfer of learning by composing solutions of elemental sequential tasks , 2004, Machine Learning.
[18] Steve R. Waterhouse,et al. Constructive Algorithms for Hierarchical Mixtures of Experts , 1995, NIPS.
[19] Mitsuo Kawato,et al. MOSAIC Model for Sensorimotor Learning and Control , 2001, Neural Computation.
[20] G. Bi,et al. Synaptic Modifications in Cultured Hippocampal Neurons: Dependence on Spike Timing, Synaptic Strength, and Postsynaptic Cell Type , 1998, The Journal of Neuroscience.
[21] D M Wolpert,et al. Multiple paired forward and inverse models for motor control , 1998, Neural Networks.
[22] A. Barto,et al. Adaptive Critics and the Basal Ganglia , 1994 .
[23] Jun Tani,et al. Self-organization of distributedly represented multiple behavior schemata in a mirror system: reviews of robot experiments using RNNPB , 2004, Neural Networks.
[24] H. Abarbanel,et al. Dynamical model of long-term synaptic plasticity , 2002, Proceedings of the National Academy of Sciences of the United States of America.
[25] Peter Dayan,et al. Technical Note: Q-Learning , 2004, Machine Learning.
[26] Daniel M. Wolpert,et al. Hierarchical MOSAIC for movement generation , 2003 .
[27] Sander M. Bohte,et al. Reducing Spike Train Variability: A Computational Theory Of Spike-Timing Dependent Plasticity , 2004, BNAIC.
[28] Kenji Doya,et al. Reinforcement Learning in Continuous Time and Space , 2000, Neural Computation.
[29] A. Graybiel,et al. Responses of tonically active neurons in the primate's striatum undergo systematic changes during behavioral sensorimotor conditioning , 1994, The Journal of neuroscience : the official journal of the Society for Neuroscience.
[30] O. Hikosaka. Models of Information Processing in the Basal Ganglia, edited by James C. Houk, Joel L. Davis and David G. Beiser, The MIT Press, 1995. $60.00 (400 pp) ISBN 0 262 08234 9 , 1995, Trends in Neurosciences.
[31] L. Abbott,et al. Synaptic plasticity: taming the beast , 2000, Nature Neuroscience.
[32] G. S. Reynolds. A Primer of Operant Conditioning , 1968 .
[33] Robert A. Legenstein,et al. What Can a Neuron Learn with Spike-Timing-Dependent Plasticity? , 2005, Neural Computation.
[34] Geoffrey E. Hinton,et al. Adaptive Mixtures of Local Experts , 1991, Neural Computation.
[35] Mitsuo Kawato,et al. Multiple Model-Based Reinforcement Learning , 2002, Neural Computation.
[36] Tetsuo Sawaragi,et al. Self-organization of inner symbols for chase: symbol organization and embodiment , 2004, 2004 IEEE International Conference on Systems, Man and Cybernetics (IEEE Cat. No.04CH37583).
[37] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[38] Kenji Doya,et al. What are the computations of the cerebellum, the basal ganglia and the cerebral cortex? , 1999, Neural Networks.
[39] Nuttapong Chentanez,et al. Intrinsically Motivated Learning of Hierarchical Collections of Skills , 2004 .
[40] Yutaka Sakai,et al. Synaptic regulation on various STDP rules , 2004, Neurocomputing.
[41] Sander M. Bohte,et al. Reducing the Variability of Neural Responses: A Computational Theory of Spike-Timing-Dependent Plasticity , 2007, Neural Computation.
[42] Leslie Pack Kaelbling,et al. Acting Optimally in Partially Observable Stochastic Domains , 1994, AAAI.
[43] Tomoki Fukai,et al. A Stochastic Method to Predict the Consequence of Arbitrary Forms of Spike-Timing-Dependent Plasticity , 2003, Neural Computation.
[44] Peter Dayan,et al. Q-learning , 1992, Machine Learning.