Multiagent Reinforcement Learning: Spiking and Nonspiking Agents in the Iterated Prisoner's Dilemma
Chris Christodoulou | Vassilis Vassiliades | Aristodemos Cleanthous
[1] R. K. Simpson. Nature Neuroscience , 2022 .
[2] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.
[3] Peter L. Bartlett,et al. Infinite-Horizon Policy-Gradient Estimation , 2001, J. Artif. Intell. Res..
[4] Michael L. Littman,et al. A hierarchy of prescriptive goals for multiagent learning , 2007, Artif. Intell..
[5] T J Sejnowski,et al. Irregular synchronous activity in stochastically-coupled networks of integrate-and-fire neurons. , 1998, Network.
[6] Xiaohui Xie,et al. Learning in neural networks by reinforcement of irregular spiking. , 2004, Physical review. E, Statistical, nonlinear, and soft matter physics.
[7] Michael H. Bowling,et al. Convergence and No-Regret in Multiagent Learning , 2004, NIPS.
[8] Yoav Shoham,et al. If multi-agent learning is the answer, what is the question? , 2007, Artif. Intell..
[9] Xiaolong Ma,et al. Global Reinforcement Learning in Neural Networks , 2007, IEEE Transactions on Neural Networks.
[10] Drew Fudenberg,et al. An economist's perspective on multi-agent learning , 2007, Artif. Intell..
[11] Yoav Shoham,et al. A general criterion and an algorithmic framework for learning in multi-agent systems , 2007, Machine Learning.
[12] Vincent Conitzer,et al. AWESOME: A general multiagent learning algorithm that converges in self-play and learns a best response against stationary opponents , 2003, Machine Learning.
[13] I. Pavlov. Conditioned Reflexes: An Investigation of the Physiological Activity of the Cerebral Cortex , 1929 .
[14] Daniel Kudenko,et al. Adaptive Agents and Multi-Agent Systems , 2003, Lecture Notes in Computer Science.
[15] Peter Stone,et al. Multiagent learning is not the answer. It is the question , 2007, Artif. Intell..
[16] Chris Watkins,et al. Learning from delayed rewards , 1989 .
[17] Simon Parsons,et al. What evolutionary game theory tells us about multiagent learning , 2007, Artif. Intell..
[18] Karl Tuyls,et al. An Evolutionary Dynamical Analysis of Multi-Agent Learning in Iterated Games , 2005, Autonomous Agents and Multi-Agent Systems.
[19] W. Hamilton,et al. The evolution of cooperation. , 1984, Science.
[20] Keith B. Hall,et al. Correlated Q-Learning , 2003, ICML.
[21] Nuttapong Chentanez,et al. Intrinsically Motivated Reinforcement Learning , 2004, NIPS.
[22] Richard L. Lewis,et al. Intrinsically Motivated Reinforcement Learning: An Evolutionary Perspective , 2010, IEEE Transactions on Autonomous Mental Development.
[23] Peter L. Bartlett,et al. Experiments with Infinite-Horizon, Policy-Gradient Estimation , 2001, J. Artif. Intell. Res..
[24] C. Christodoulou,et al. Is self-control a learned strategy employed by a reward maximizing brain? , 2009, BMC Neuroscience.
[25] Matthew Saffell,et al. Learning to trade via direct reinforcement , 2001, IEEE Trans. Neural Networks.
[26] Samuel M. McClure,et al. Separate Neural Systems Value Immediate and Delayed Monetary Rewards , 2004, Science.
[27] Bart De Schutter,et al. A Comprehensive Survey of Multiagent Reinforcement Learning , 2008, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[28] Long Ji Lin,et al. Self-improving reactive agents based on reinforcement learning, planning and teaching , 1992, Machine Learning.
[29] Robert H. Crites,et al. Multiagent reinforcement learning in the Iterated Prisoner's Dilemma. , 1996, Bio Systems.
[30] Gregory S. Kavka. Is Individual Choice Less Problematic than Collective Choice? , 1991, Economics and Philosophy.
[31] R. Lathe. Phd by thesis , 1988, Nature.
[32] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[33] Malvern Lumsden,et al. The Cyprus Conflict as a Prisoner's Dilemma Game , 1973 .
[34] Ronald Smith,et al. The Prisoner's Dilemma and Regime-Switching in the Greek-Turkish Arms Race , 2000 .
[35] Yishay Mansour,et al. Nash Convergence of Gradient Dynamics in General-Sum Games , 2000, UAI.
[36] Ralph Neuneier,et al. Multi-agent modeling of multiple FX-markets by neural networks , 2001, IEEE Trans. Neural Networks.
[37] M. McLure. One Hundred Years from Today: Vilfredo Pareto, Manuale di Economia Politica con una Introduzione alla Scienza Sociale, Milan: Societa Editrice Libraria. 1906 , 2006 .
[38] Koichi Moriyama,et al. Utility based Q-learning to facilitate cooperation in Prisoner's Dilemma games , 2009, Web Intell. Agent Syst..
[39] Peter Tino,et al. IEEE Transactions on Neural Networks , 2009 .
[40] Michael L. Littman,et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning , 1994, ICML.
[41] Y. Dan,et al. Spike Timing-Dependent Plasticity of Neural Circuits , 2004, Neuron.
[42] H. Seung,et al. Learning in Spiking Neural Networks by Reinforcement of Stochastic Synaptic Transmission , 2003, Neuron.
[43] R. J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 2004, Machine Learning.
[44] E. Izhikevich. Solving the distal reward problem through linkage of STDP and dopamine signaling , 2007, BMC Neuroscience.
[45] John S. Edwards,et al. The Hedonistic Neuron: A Theory of Memory, Learning and Intelligence , 1983 .
[46] Ran Ginosar,et al. Adaptive Cardiac Resynchronization Therapy Device Based on Spiking Neurons Architecture and Reinforcement Learning Scheme , 2007, IEEE Transactions on Neural Networks.
[47] G. Bi,et al. Synaptic Modifications in Cultured Hippocampal Neurons: Dependence on Spike Timing, Synaptic Strength, and Postsynaptic Cell Type , 1998, The Journal of Neuroscience.
[48] A. Rapoport,et al. Prisoner's Dilemma: A Study in Conflict and Co-operation , 1970 .
[49] Daniel Kudenko,et al. Reinforcement Learning of Coordination in Heterogeneous Cooperative Multi-agent Systems , 2005, Adaptive Agents and Multi-Agent Systems.
[50] A. Hodgkin,et al. A quantitative description of membrane current and its application to conduction and excitation in nerve , 1990 .
[51] Craig Boutilier,et al. The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems , 1998, AAAI/IAAI.
[52] Lukasz A. Kurgan,et al. A new synaptic plasticity rule for networks of spiking neurons , 2006, IEEE Transactions on Neural Networks.
[53] Marco Wiering,et al. Convergence and Divergence in Standard and Averaging Reinforcement Learning , 2004, ECML.
[54] Manuela M. Veloso,et al. Multiagent learning using a variable learning rate , 2002, Artif. Intell..
[55] Wulfram Gerstner,et al. Spike-Based Reinforcement Learning in Continuous State and Action Space: When Policy Gradient Methods Fail , 2009, PLoS Comput. Biol..
[56] Manuela M. Veloso,et al. Rational and Convergent Learning in Stochastic Games , 2001, IJCAI.
[57] Ronald J. MacGregor,et al. Neural and brain modeling , 1987 .
[58] Chris Christodoulou,et al. Does High Firing Irregularity Enhance Learning? , 2011, Neural Computation.
[59] Vilfredo Pareto,et al. Manuale di economia politica , 1965 .
[60] D. Johnston,et al. Regulation of Synaptic Efficacy by Coincidence of Postsynaptic APs and EPSPs , 1997 .
[61] Ahmet Sözen,et al. Negotiating a Resolution to the Cyprus Problem: Is Potential European Union Membership a Blessing or a Curse? , 2002 .
[62] Eugene M. Izhikevich,et al. Simple model of spiking neurons , 2003, IEEE Trans. Neural Networks.
[63] M. Farries,et al. Reinforcement learning with modulated spike timing dependent synaptic plasticity. , 2007, Journal of neurophysiology.
[64] Geoffrey J. Gordon. Agendas for multi-agent learning , 2007, Artif. Intell..
[65] Kazushi Ikeda,et al. A statistical property of multiagent learning based on Markov decision process , 2006, IEEE Trans. Neural Networks.
[66] Gillian M. Hayes,et al. Evolution of Valence Systems in an Unstable Environment , 2008, SAB.
[67] H. J. Mclaughlin,et al. Learn , 2002 .
[68] Ron Meir,et al. Reinforcement Learning, Spike-Time-Dependent Plasticity, and the BCM Rule , 2007, Neural Computation.
[69] Jean-Pascal Pfister,et al. Optimal Spike-Timing-Dependent Plasticity for Precise Action Potential Firing in Supervised Learning , 2005, Neural Computation.
[70] Victor R. Lesser,et al. A Multiagent Reinforcement Learning Algorithm with Non-linear Dynamics , 2008, J. Artif. Intell. Res..
[71] Alvin E. Roth,et al. Multi-agent learning and the descriptive value of simple models , 2007, Artif. Intell..
[72] F. Charpillet,et al. Efficient Learning in Games , 2006 .
[73] Chris Christodoulou,et al. Multiagent Reinforcement Learning with Spiking and Non-Spiking Agents in the Iterated Prisoner's Dilemma , 2009, ICANN.
[74] Markus Diesmann,et al. A Spiking Neural Network Model of an Actor-Critic Learning Agent , 2009, Neural Computation.
[75] Razvan V. Florian,et al. Reinforcement Learning Through Modulation of Spike-Timing-Dependent Synaptic Plasticity , 2007, Neural Computation.
[76] Chris Christodoulou,et al. Multiagent Reinforcement Learning in the Iterated Prisoner's Dilemma: Fast cooperation through evolved payoffs , 2010, The 2010 International Joint Conference on Neural Networks (IJCNN).
[77] Michael P. Wellman,et al. Nash Q-Learning for General-Sum Stochastic Games , 2003, J. Mach. Learn. Res..
[78] David Kraines,et al. The Threshold of Cooperation Among Adaptive Agents: Pavlov and the Stag Hunt , 1996, ATAL.
[79] C. Christodoulou,et al. Self-control with spiking and non-spiking neural networks playing games , 2010, Journal of Physiology-Paris.
[80] M. Dufwenberg. Game theory. , 2011, Wiley interdisciplinary reviews. Cognitive science.
[81] Guido Bugmann,et al. A Spiking Neuron Model: Applications and Learning , 2002, Neural Networks.
[82] Yoonsuck Choe,et al. Extrapolative Delay Compensation Through Facilitating Synapses and Its Relation to the Flash-Lag Effect , 2008, IEEE Transactions on Neural Networks.
[83] L. Abbott,et al. Synaptic plasticity: taming the beast , 2000, Nature Neuroscience.
[84] Xin Yao,et al. The Iterated Prisoners' Dilemma - 20 Years On , 2007, Advances in Natural Computation.
[85] David H. Ackley,et al. Interactions between learning and evolution , 1991 .
[86] R. J. MacGregor,et al. A model for repetitive firing in neurons , 2004, Kybernetik.
[87] Eugene M. Izhikevich,et al. Which model to use for cortical spiking neurons? , 2004, IEEE Transactions on Neural Networks.
[88] Luigi Fortuna,et al. Learning Anticipation via Spiking Networks: Application to Navigation Control , 2009, IEEE Transactions on Neural Networks.
[89] J. Nash. Equilibrium Points in N-Person Games. , 1950, Proceedings of the National Academy of Sciences of the United States of America.
[90] Mahesan Niranjan,et al. On-line Q-learning using connectionist systems , 1994 .
[91] B. Babkin. Conditioned Reflexes; an Investigation of the Physiological Activity of the Cerebral Cortex. , 1929 .
[92] R. Stein. Some models of neuronal variability. , 1967, Biophysical journal.
[93] D. Wilkin,et al. Neuron , 2001, Brain Research.
[94] Michael L. Littman,et al. Friend-or-Foe Q-learning in General-Sum Games , 2001, ICML.
[95] M. Nowak,et al. A strategy of win-stay, lose-shift that outperforms tit-for-tat in the Prisoner's Dilemma game , 1993, Nature.
[96] Michael A. Goodrich,et al. Learning to compete, coordinate, and cooperate in repeated games using reinforcement learning , 2011, Machine Learning.
[97] Robert A. Legenstein,et al. A Learning Theory for Reward-Modulated Spike-Timing-Dependent Plasticity with Application to Biofeedback , 2008, PLoS Comput. Biol..
[98] Bikramjit Banerjee,et al. Convergent Gradient Ascent in General-Sum Games , 2002, ECML.