暂无分享,去创建一个
[1] Matthew Riemer,et al. Routing Networks: Adaptive Selection of Non-linear Functions for Multi-Task Learning , 2017, ICLR.
[2] Eric B. Baum,et al. Toward a Model of Mind as a Laissez-Faire Economy of Idiots , 1996, ICML.
[3] Sergey Levine,et al. Reinforcement Learning with Competitive Ensembles of Information-Constrained Primitives , 2019, ICLR.
[4] O. G. Selfridge,et al. Pandemonium: a paradigm for learning , 1988 .
[5] V. Braitenberg. Vehicles, Experiments in Synthetic Psychology , 1984 .
[6] David Balduzzi,et al. Cortical prediction markets , 2014, AAMAS.
[7] Hiroshi Kajino,et al. Neuron as an Agent , 2018, ICLR.
[8] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell..
[9] Xiaotie Deng,et al. Settling the complexity of computing two-player Nash equilibria , 2007, JACM.
[10] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.
[11] John H. Holland,et al. Properties of the Bucket Brigade , 1985, ICGA.
[12] Sergey Levine,et al. MCP: Learning Composable Hierarchical Control with Multiplicative Compositional Policies , 2019, NeurIPS.
[13] Michael I. Jordan,et al. Policy-Gradient Algorithms Have No Guarantees of Convergence in Continuous Action and State Multi-Agent Settings , 2019, ArXiv.
[14] Tim Roughgarden,et al. Twenty Lectures on Algorithmic Game Theory , 2016, Bull. EATCS.
[15] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[16] Marvin Minsky,et al. Society of Mind: A Response to Four Reviews , 1991, Artif. Intell..
[17] Paul W. Goldberg,et al. The complexity of computing a Nash equilibrium , 2006, STOC '06.
[18] Leslie Pack Kaelbling,et al. Modular meta-learning , 2018, CoRL.
[19] Sepp Hochreiter,et al. RUDDER: Return Decomposition for Delayed Rewards , 2018, NeurIPS.
[20] Sergey Levine,et al. High-Dimensional Continuous Control Using Generalized Advantage Estimation , 2015, ICLR.
[21] R. A. Brooks,et al. Intelligence without Representation , 1991, Artif. Intell..
[22] Jürgen Schmidhuber,et al. Market-Based Reinforcement Learning in Partially Observable Worlds , 2001, ICANN.
[23] Natalia Gimelshein,et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library , 2019, NeurIPS.
[24] Thomas L. Griffiths,et al. Automatically Composing Representation Transformations as a Means for Generalization , 2018, ICLR.
[25] Michael I. Jordan,et al. Policy-Gradient Algorithms Have No Guarantees of Convergence in Linear Quadratic Games , 2019, AAMAS.
[26] Alec Radford,et al. Proximal Policy Optimization Algorithms , 2017, ArXiv.
[27] Jürgen Schmidhuber,et al. Compete to Compute , 2013, NIPS.
[28] Jürgen Schmidhuber,et al. A Local Learning Algorithm for Dynamic Feedforward and Recurrent Networks , 1989 .
[29] G. Reeke. The society of mind , 1991 .
[30] William Vickrey,et al. Counterspeculation, Auctions, And Competitive Sealed Tenders , 1961 .
[31] Alexei A. Efros,et al. Learning to Control Self-Assembling Morphologies: A Study of Generalization via Modularity , 2019, NeurIPS.