Grammars for Games: A Gradient-Based, Game-Theoretic Framework for Optimization in Deep Learning
[1] Yoram Singer,et al. Train faster, generalize better: Stability of stochastic gradient descent , 2015, ICML.
[2] David Balduzzi,et al. Falsification and Future Performance , 2011, Algorithmic Probability and Friends.
[3] J. Neumann,et al. Theory of games and economic behavior , 1945, 100 Years of Math Milestones.
[4] Alexander Shapiro,et al. Stochastic Approximation approach to Stochastic Programming , 2013 .
[5] Max Welling,et al. Auto-Encoding Variational Bayes , 2013, ICLR.
[6] Gábor Lugosi,et al. Prediction, learning, and games , 2006 .
[7] Shalabh Bhatnagar,et al. Toward Off-Policy Learning Control with Function Approximation , 2010, ICML.
[8] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.
[9] Jeffrey D. Ullman,et al. Introduction to Automata Theory, Languages and Computation , 1979 .
[10] Matemática,et al. Society for Industrial and Applied Mathematics , 2010 .
[11] Pieter Abbeel,et al. Gradient Estimation Using Stochastic Computation Graphs , 2015, NIPS.
[12] L. Bottou. From machine learning to machine reasoning , 2011, Machine Learning.
[13] R. Vohra,et al. Calibrated Learning and Correlated Equilibrium , 1996 .
[14] Razvan Pascanu,et al. Theano: new features and speed improvements , 2012, ArXiv.
[15] David Balduzzi,et al. Towards a learning-theoretic analysis of spike-timing dependent plasticity , 2012, NIPS.
[16] Judea Pearl,et al. Probabilistic reasoning in intelligent systems - networks of plausible inference , 1991, Morgan Kaufmann series in representation and reasoning.
[17] Ronald J. Williams,et al. Gradient-based learning algorithms for recurrent networks and their computational complexity , 1995 .
[18] Yann LeCun,et al. Open Problem: The landscape of the loss surfaces of multilayer networks , 2015, COLT.
[19] David Balduzzi,et al. Deep Online Convex Optimization by Putting Forecaster to Sleep , 2015, ArXiv.
[20] Yoshua Bengio,et al. Deep Sparse Rectifier Neural Networks , 2011, AISTATS.
[21] Kenneth D. Harris,et al. The Neural Marketplace: I. General Formalism and Linear Theory , 2014, bioRxiv.
[22] James E. Tomberlin,et al. On the Plurality of Worlds. , 1989 .
[23] David Balduzzi,et al. Randomized co-training: from cortical neurons to machine learning and back again , 2013, ArXiv.
[24] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[25] Léon Bottou,et al. The Tradeoffs of Large Scale Learning , 2007, NIPS.
[26] Tara N. Sainath,et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition , 2012 .
[27] Léon Bottou,et al. From machine learning to machine reasoning , 2011, Machine Learning.
[28] Jürgen Schmidhuber,et al. Deep learning in neural networks: An overview , 2014, Neural Networks.
[29] Peter Dayan,et al. A Neural Substrate of Prediction and Reward , 1997, Science.
[30] Nathan Lay,et al. Supervised Aggregation of Classifiers using Artificial Prediction Markets , 2010, ICML.
[31] Ohad Shamir,et al. On Lower and Upper Bounds in Smooth and Strongly Convex Optimization , 2016, J. Mach. Learn. Res..
[32] Philipp Slusallek,et al. Introduction to real-time ray tracing , 2005, SIGGRAPH Courses.
[33] V. Lamme,et al. The distinct modes of vision offered by feedforward and recurrent processing , 2000, Trends in Neurosciences.
[34] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[35] Peter Norvig,et al. Artificial Intelligence: A Modern Approach , 1995 .
[36] Yoshua Bengio,et al. Blocks and Fuel: Frameworks for deep learning , 2015, ArXiv.
[37] P. Werbos,et al. Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences , 1974 .
[38] M. Minsky. The Society of Mind , 1986 .
[39] Ohad Shamir,et al. On Lower and Upper Bounds for Smooth and Strongly Convex Optimization Problems , 2015, ArXiv.
[40] Yishay Mansour,et al. From External to Internal Regret , 2005, J. Mach. Learn. Res..
[41] Samuel J. Gershman,et al. Computational rationality: A converging paradigm for intelligence in brains, minds, and machines , 2015, Science.
[42] P. Dayan. Twenty-Five Lessons from Computational Neuromodulation , 2012, Neuron.
[43] Gábor Lugosi,et al. Learning correlated equilibria in games with compact sets of strategies , 2007, Games Econ. Behav..
[44] John S. Edwards,et al. The Hedonistic Neuron: A Theory of Memory, Learning and Intelligence , 1983 .
[45] Muhammad Ghifary,et al. Compatible Value Gradients for Reinforcement Learning of Continuous Deep Policies , 2015, ArXiv.
[46] Guy Lever,et al. Deterministic Policy Gradient Algorithms , 2014, ICML.
[47] Jeffrey L. Elman,et al. Finding Structure in Time , 1990, Cogn. Sci..
[48] Shai Shalev-Shwartz,et al. Online Learning and Online Convex Optimization , 2012, Found. Trends Mach. Learn..
[49] Geoffrey E. Hinton,et al. Learning representations by back-propagating errors , 1986, Nature.
[50] Yuval Tassa,et al. Learning Continuous Control Policies by Stochastic Value Gradients , 2015, NIPS.
[51] Martin Zinkevich,et al. Online Convex Programming and Generalized Infinitesimal Gradient Ascent , 2003, ICML.
[52] Razvan Pascanu,et al. Theano: A CPU and GPU Math Compiler in Python , 2010, SciPy.
[53] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[54] Michael P. Wellman,et al. Economic reasoning and artificial intelligence , 2015, Science.
[55] Martin J. Wainwright,et al. Information-theoretic lower bounds on the oracle complexity of convex optimization , 2009, NIPS.
[56] O. G. Selfridge,et al. Pandemonium: a paradigm for learning , 1988 .
[57] David Balduzzi,et al. Cortical prediction markets , 2014, AAMAS.
[58] Barak A. Pearlmutter,et al. Automatic Differentiation of Algorithms for Machine Learning , 2014, ArXiv.
[59] Shalabh Bhatnagar,et al. Fast gradient-descent methods for temporal-difference learning with linear function approximation , 2009, ICML '09.
[60] H. Robbins. A Stochastic Approximation Method , 1951 .
[61] Martin A. Riedmiller,et al. Reinforcement learning in feedback control , 2011, Machine Learning.
[62] Jan Peters,et al. A Survey on Policy Search for Robotics , 2013, Found. Trends Robotics.
[63] Yoshua Bengio,et al. Generative Adversarial Nets , 2014, NIPS.
[64] Richard S. Sutton,et al. Neuronlike adaptive elements that can solve difficult learning control problems , 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[65] David I. Spivak. The operad of wiring diagrams: formalizing a graphical language for databases, recursion, and plug-and-play circuits , 2013, ArXiv.
[66] Jing Peng,et al. An Efficient Gradient-Based Algorithm for On-Line Training of Recurrent Network Trajectories , 1990, Neural Computation.
[67] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[68] J. Wickens,et al. Timing is not Everything: Neuromodulation Opens the STDP Gate , 2010, Front. Syn. Neurosci..
[69] Patrick M. Pilarski,et al. Horde: a scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction , 2011, AAMAS.
[70] Mark D. Reid,et al. Convergence Analysis of Prediction Markets via Randomized Subspace Descent , 2015, NIPS.
[71] Yoshua Bengio,et al. Blocks and Fuel , 2015 .
[72] Joachim M. Buhmann,et al. Kickback Cuts Backprop's Red-Tape: Biologically Plausible Credit Assignment in Neural Networks , 2014, AAAI.
[73] Pieter R. Roelfsema,et al. Attention-Gated Reinforcement Learning of Internal Representations for Classification , 2005, Neural Computation.
[74] Tara N. Sainath,et al. Fundamental Technologies in Modern Speech Recognition , 2012 , doi:10.1109/MSP.2012.2205597 .
[75] James L. McClelland,et al. Parallel distributed processing: explorations in the microstructure of cognition, vol. 1: foundations , 1986 .
[76] Jacob D. Abernethy,et al. A Collaborative Mechanism for Crowdsourcing Prediction Problems , 2011, NIPS.
[77] Guigang Zhang,et al. Deep Learning , 2016, Int. J. Semantic Comput..
[78] Yann LeCun,et al. The Loss Surface of Multilayer Networks , 2014, ArXiv.
[79] Giulio Tononi,et al. What can neurons do for their brain? Communicate selectivity with bursts , 2013, Theory in Biosciences.
[80] Daniel Cownden,et al. Random feedback weights support learning in deep neural networks , 2014, ArXiv.
[81] X. Jin. Factor graphs and the Sum-Product Algorithm , 2002 .
[82] Michael I. Jordan,et al. Graphical Models, Exponential Families, and Variational Inference , 2008, Found. Trends Mach. Learn..
[83] Richard S. Sutton,et al. A Convergent O(n) Temporal-difference Algorithm for Off-policy Learning with Linear Function Approximation , 2008, NIPS.
[84] Jürgen Schmidhuber,et al. Market-Based Reinforcement Learning in Partially Observable Worlds , 2001, ICANN.
[85] Yoshua Bengio,et al. Difference Target Propagation , 2014, ECML/PKDD.
[86] Edoardo M. Airoldi,et al. Statistical analysis of stochastic gradient methods for generalized linear models , 2014, ICML.
[87] Maxim Raginsky,et al. Information-Based Complexity, Feedback and Dynamics in Convex Programming , 2010, IEEE Transactions on Information Theory.
[88] H. Seung,et al. Learning in Spiking Neural Networks by Reinforcement of Stochastic Synaptic Transmission , 2003, Neuron.
[89] Patrick Gallinari,et al. A Framework for the Cooperation of Learning Algorithms , 1990, NIPS.
[90] Vladimir N. Vapnik,et al. The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.
[91] Quoc V. Le,et al. Sequence to Sequence Learning with Neural Networks , 2014, NIPS.
[92] Jan Peters,et al. Policy evaluation with temporal differences: a survey and comparison , 2015, J. Mach. Learn. Res..
[93] A G Barto,et al. Learning by statistical cooperation of self-interested neuron-like computing elements. , 1985, Human neurobiology.
[94] Pascal Vincent,et al. Representation Learning: A Review and New Perspectives , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[95] Francis Crick,et al. The recent excitement about neural networks , 1989, Nature.
[96] Andreas Griewank,et al. Evaluating derivatives - principles and techniques of algorithmic differentiation, Second Edition , 2000, Frontiers in applied mathematics.
[97] Haipeng Luo,et al. Fast Convergence of Regularized Learning in Games , 2015, NIPS.
[98] Rafal Butowt,et al. Anterograde axonal transport, transcytosis, and recycling of neurotrophic factors , 2001, Molecular Neurobiology.
[99] Geoffrey J. Gordon. No-regret Algorithms for Online Convex Programs , 2006, NIPS.
[100] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[101] Leemon C. Baird,et al. Residual Algorithms: Reinforcement Learning with Function Approximation , 1995, ICML.
[102] Jean-Yves Audibert. Optimization for Machine Learning , 1995 .
[103] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[104] Edoardo M. Airoldi,et al. Implicit Temporal Differences , 2014, ArXiv.
[105] D. Rumelhart. Parallel Distributed Processing Volume 1: Foundations , 1987 .
[106] Yoshua Bengio,et al. Deep Learning of Representations: Looking Forward , 2013, SLSP.
[107] John Darzentas,et al. Problem Complexity and Method Efficiency in Optimization , 1983 .
[108] Donald C. Wunsch,et al. Corrections To "Adaptive Critic Designs" , 1997, IEEE Trans. Neural Networks.
[109] Amos J. Storkey,et al. Machine Learning Markets , 2011, AISTATS.
[110] David Balduzzi,et al. Metabolic Cost as an Organizing Principle for Cooperative Learning , 2012, Adv. Complex Syst..
[111] Eric B. Baum,et al. Toward a Model of Intelligence as an Economy of Agents , 1999, Machine Learning.
[112] Tim Roughgarden,et al. Algorithmic Game Theory , 2007 .