暂无分享,去创建一个
[1] Edoardo M. Airoldi,et al. Statistical analysis of stochastic gradient methods for generalized linear models , 2014, ICML.
[2] R. Sutton,et al. A convergent O ( n ) algorithm for off-policy temporal-difference learning with linear function approximation , 2008, NIPS 2008.
[3] Yoshua Bengio,et al. Deep Learning of Representations: Looking Forward , 2013, SLSP.
[4] David Balduzzi,et al. Metabolic Cost as an Organizing Principle for Cooperative Learning , 2012, Adv. Complex Syst..
[5] Maxim Raginsky,et al. Information-Based Complexity, Feedback and Dynamics in Convex Programming , 2010, IEEE Transactions on Information Theory.
[6] John Darzentas,et al. Problem Complexity and Method Efficiency in Optimization , 1983 .
[7] Michael I. Jordan,et al. Graphical Models, Exponential Families, and Variational Inference , 2008, Found. Trends Mach. Learn..
[8] Jürgen Schmidhuber,et al. Market-Based Reinforcement Learning in Partially Observable Worlds , 2001, ICANN.
[9] Amos J. Storkey,et al. Machine Learning Markets , 2011, AISTATS.
[10] Samuel J. Gershman,et al. Computational rationality: A converging paradigm for intelligence in brains, minds, and machines , 2015, Science.
[11] P. Dayan. Twenty-Five Lessons from Computational Neuromodulation , 2012, Neuron.
[12] Gábor Lugosi,et al. Learning correlated equilibria in games with compact sets of strategies , 2007, Games Econ. Behav..
[13] Andreas Griewank,et al. Evaluating derivatives - principles and techniques of algorithmic differentiation, Second Edition , 2000, Frontiers in applied mathematics.
[14] Shalabh Bhatnagar,et al. Fast gradient-descent methods for temporal-difference learning with linear function approximation , 2009, ICML '09.
[15] Haipeng Luo,et al. Fast Convergence of Regularized Learning in Games , 2015, NIPS.
[16] H. Robbins. A Stochastic Approximation Method , 1951 .
[17] Yishay Mansour,et al. From External to Internal Regret , 2005, J. Mach. Learn. Res..
[18] Martin A. Riedmiller,et al. Reinforcement learning in feedback control , 2011, Machine Learning.
[19] Martin Zinkevich,et al. Online Convex Programming and Generalized Infinitesimal Gradient Ascent , 2003, ICML.
[20] Jan Peters,et al. A Survey on Policy Search for Robotics , 2013, Found. Trends Robotics.
[21] Yoshua Bengio,et al. Generative Adversarial Nets , 2014, NIPS.
[22] Razvan Pascanu,et al. Theano: new features and speed improvements , 2012, ArXiv.
[23] David Balduzzi,et al. Towards a learning-theoretic analysis of spike-timing dependent plasticity , 2012, NIPS.
[24] Yoshua Bengio,et al. Difference Target Propagation , 2014, ECML/PKDD.
[25] Judea Pearl,et al. Probabilistic reasoning in intelligent systems - networks of plausible inference , 1991, Morgan Kaufmann series in representation and reasoning.
[26] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[27] Richard S. Sutton,et al. Neuronlike adaptive elements that can solve difficult learning control problems , 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[28] Tara N. Sainath,et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition , 2012 .
[29] Peter Dayan,et al. A Neural Substrate of Prediction and Reward , 1997, Science.
[30] Eric B. Baum,et al. Toward a Model of Intelligence as an Economy of Agents , 1999, Machine Learning.
[31] Nathan Lay,et al. Supervised Aggregation of Classifiers using Artificial Prediction Markets , 2010, ICML.
[32] Yoshua Bengio,et al. Blocks and Fuel: Frameworks for deep learning , 2015, ArXiv.
[33] Jeffrey D. Ullman,et al. Introduction to Automata Theory, Languages and Computation , 1979 .
[34] John S. Edwards,et al. The Hedonistic Neuron: A Theory of Memory, Learning and Intelligence , 1983 .
[35] Guy Lever,et al. Deterministic Policy Gradient Algorithms , 2014, ICML.
[36] Jeffrey L. Elman,et al. Finding Structure in Time , 1990, Cogn. Sci..
[37] Max Welling,et al. Auto-Encoding Variational Bayes , 2013, ICLR.
[38] Shalabh Bhatnagar,et al. Toward Off-Policy Learning Control with Function Approximation , 2010, ICML.
[39] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.
[40] Shai Shalev-Shwartz,et al. Online Learning and Online Convex Optimization , 2012, Found. Trends Mach. Learn..
[41] 守屋 悦朗,et al. J.E.Hopcroft, J.D. Ullman 著, "Introduction to Automata Theory, Languages, and Computation", Addison-Wesley, A5変形版, X+418, \6,670, 1979 , 1980 .
[42] Ronald J. Williams,et al. Gradient-based learning algorithms for recurrent networks and their computational complexity , 1995 .
[43] Geoffrey E. Hinton,et al. Learning representations by back-propagating errors , 1986, Nature.
[44] Pieter Abbeel,et al. Gradient Estimation Using Stochastic Computation Graphs , 2015, NIPS.
[45] Daniel Cownden,et al. Random feedback weights support learning in deep neural networks , 2014, ArXiv.
[46] L. Bottou. From machine learning to machine reasoning , 2011, Machine Learning.
[47] R. Vohra,et al. Calibrated Learning and Correlated Equilibrium , 1996 .
[48] Philipp Slusallek,et al. Introduction to real-time ray tracing , 2005, SIGGRAPH Courses.
[49] Geoffrey E. Hinton,et al. Learning representations by back-propagation errors, nature , 1986 .
[50] Peter Norvig,et al. Artificial Intelligence: A Modern Approach , 1995 .
[51] Edoardo M. Airoldi,et al. Implicit Temporal Differences , 2014, ArXiv.
[52] Muhammad Ghifary,et al. Compatible Value Gradients for Reinforcement Learning of Continuous Deep Policies , 2015, ArXiv.
[53] P. Werbos,et al. Beyond Regression : "New Tools for Prediction and Analysis in the Behavioral Sciences , 1974 .
[54] H. Seung,et al. Learning in Spiking Neural Networks by Reinforcement of Stochastic Synaptic Transmission , 2003, Neuron.
[55] Patrick Gallinari,et al. A Framework for the Cooperation of Learning Algorithms , 1990, NIPS.
[56] Vladimir N. Vapnik,et al. The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.
[57] Quoc V. Le,et al. Sequence to Sequence Learning with Neural Networks , 2014, NIPS.
[58] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[59] O. G. Selfridge,et al. Pandemonium: a paradigm for learning , 1988 .
[60] M. Minsky. The Society of Mind , 1986 .
[61] Ohad Shamir,et al. On Lower and Upper Bounds for Smooth and Strongly Convex Optimization Problems , 2015, ArXiv.
[62] David Balduzzi,et al. Cortical prediction markets , 2014, AAMAS.
[63] David I. Spivak. The operad of wiring diagrams: formalizing a graphical language for databases, recursion, and plug-and-play circuits , 2013, ArXiv.
[64] Joachim M. Buhmann,et al. Kickback Cuts Backprop's Red-Tape: Biologically Plausible Credit Assignment in Neural Networks , 2014, AAAI.
[65] Jan Peters,et al. Policy evaluation with temporal differences: a survey and comparison , 2015, J. Mach. Learn. Res..
[66] Guigang Zhang,et al. Deep Learning , 2016, Int. J. Semantic Comput..
[67] Giulio Tononi,et al. What can neurons do for their brain? Communicate selectivity with bursts , 2013, Theory in Biosciences.
[68] A G Barto,et al. Learning by statistical cooperation of self-interested neuron-like computing elements. , 1985, Human neurobiology.
[69] Pascal Vincent,et al. Representation Learning: A Review and New Perspectives , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[70] David Balduzzi,et al. Falsification and Future Performance , 2011, Algorithmic Probability and Friends.
[71] X. Jin. Factor graphs and the Sum-Product Algorithm , 2002 .
[72] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[73] Patrick M. Pilarski,et al. Horde: a scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction , 2011, AAMAS.
[74] Mark D. Reid,et al. Convergence Analysis of Prediction Markets via Randomized Subspace Descent , 2015, NIPS.
[75] D. Rumelhart. Parallel Distributed Processing Volume 1: Foundations , 1987 .
[76] Rafal Butowt,et al. Anterograde axonal transport, transcytosis, and recycling of neurotrophic factors , 2001, Molecular Neurobiology.
[77] Geoffrey J. Gordon. No-regret Algorithms for Online Convex Programs , 2006, NIPS.
[78] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[79] Leemon C. Baird,et al. Residual Algorithms: Reinforcement Learning with Function Approximation , 1995, ICML.
[80] Jean-Yves Audibert. Optimization for Machine Learning , 1995 .
[81] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[82] Jing Peng,et al. An Efficient Gradient-Based Algorithm for On-Line Training of Recurrent Network Trajectories , 1990, Neural Computation.
[83] J. Wickens,et al. Timing is not Everything: Neuromodulation Opens the STDP Gate , 2010, Front. Syn. Neurosci..
[84] Yoshua Bengio,et al. Deep Sparse Rectifier Neural Networks , 2011, AISTATS.
[85] James E. Tomberlin,et al. On the Plurality of Worlds. , 1989 .
[86] Léon Bottou,et al. The Tradeoffs of Large Scale Learning , 2007, NIPS.
[87] Léon Bottou,et al. From machine learning to machine reasoning , 2011, Machine Learning.
[88] Yann LeCun,et al. Open Problem: The landscape of the loss surfaces of multilayer networks , 2015, COLT.
[89] David Balduzzi,et al. Deep Online Convex Optimization by Putting Forecaster to Sleep , 2015, ArXiv.
[90] David Balduzzi,et al. Randomized co-training: from cortical neurons to machine learning and back again , 2013, ArXiv.
[91] Jacob D. Abernethy,et al. A Collaborative Mechanism for Crowdsourcing Prediction Problems , 2011, NIPS.
[92] Yann LeCun,et al. The Loss Surface of Multilayer Networks , 2014, ArXiv.
[93] Michael P. Wellman,et al. Economic reasoning and artificial intelligence , 2015, Science.
[94] Martin J. Wainwright,et al. Information-theoretic lower bounds on the oracle complexity of convex optimization , 2009, NIPS.
[95] Yoram Singer,et al. Train faster, generalize better: Stability of stochastic gradient descent , 2015, ICML.
[96] J. Neumann,et al. Theory of games and economic behavior , 1945, 100 Years of Math Milestones.
[97] Alexander Shapiro,et al. Stochastic Approximation approach to Stochastic Programming , 2013 .
[98] Gábor Lugosi,et al. Prediction, learning, and games , 2006 .
[99] Barak A. Pearlmutter,et al. Automatic Differentiation of Algorithms for Machine Learning , 2014, ArXiv.
[100] Kenneth D. Harris,et al. The Neural Marketplace: I. General Formalism and Linear Theory , 2014, bioRxiv.
[101] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[102] Jürgen Schmidhuber,et al. Deep learning in neural networks: An overview , 2014, Neural Networks.