[1] Henry C. Ellis,et al. Transfer of Learning , 2021, Research in Mathematics Education.
[2] Leslie G. Valiant,et al. A theory of the learnable , 1984, STOC '84.
[3] J. Murphy. Technical Analysis of the Futures Markets: A Comprehensive Guide to Trading Methods and Applications , 1986 .
[4] J. Hull. Options, Futures, and Other Derivatives , 1989 .
[5] Kurt Hornik,et al. Multilayer feedforward networks are universal approximators , 1989, Neural Networks.
[6] Raymond Kurzweil,et al. Age of intelligent machines , 1990 .
[7] Richard S. Sutton,et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming , 1990, ML.
[8] Jürgen Schmidhuber,et al. A possibility for implementing curiosity and boredom in model-building neural controllers , 1991 .
[9] Richard S. Sutton,et al. Reinforcement Learning is Direct Adaptive Optimal Control , 1992, 1991 American Control Conference.
[10] Yoshua Bengio,et al. Learning a synaptic learning rule , 1991, IJCNN-91-Seattle International Joint Conference on Neural Networks.
[11] Geoffrey E. Hinton,et al. Feudal Reinforcement Learning , 1992, NIPS.
[12] Richard S. Sutton,et al. Adapting Bias by Gradient Descent: An Incremental Version of Delta-Bar-Delta , 1992, AAAI.
[13] Ming Tan,et al. Multi-Agent Reinforcement Learning: Independent versus Cooperative Agents , 1997, ICML.
[14] Ming Li,et al. An Introduction to Kolmogorov Complexity and Its Applications , 2019, Texts in Computer Science.
[15] Peter Dayan,et al. Improving Generalization for Temporal Difference Learning: The Successor Representation , 1993, Neural Computation.
[16] Gerald Tesauro,et al. TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play , 1994, Neural Computation.
[17] Michael L. Littman,et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning , 1994, ICML.
[18] Deborah Silver,et al. Feature Visualization , 1994, Scientific Visualization.
[19] Leemon C. Baird,et al. Residual Algorithms: Reinforcement Learning with Function Approximation , 1995, ICML.
[20] John N. Tsitsiklis,et al. Analysis of Temporal-Difference Learning with Function Approximation , 1996, NIPS.
[21] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[22] Stuart J. Russell,et al. Reinforcement Learning with Hierarchies of Machines , 1997, NIPS.
[23] Ralph Neuneier,et al. Enhancing Q-Learning for Optimal Asset Allocation , 1997, NIPS.
[24] Jonathan Schaeffer. One Jump Ahead , 1997 .
[25] T. Crystal. Conversational speech recognition , 1997 .
[26] Randy Goebel,et al. Computational intelligence - a logical approach , 1998 .
[27] Leslie Pack Kaelbling,et al. Planning and Acting in Partially Observable Stochastic Domains , 1998, Artif. Intell..
[28] Vladimir Vapnik,et al. Statistical learning theory , 1998 .
[29] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell..
[30] Andrew Y. Ng,et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping , 1999, ICML.
[31] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[32] Manuela M. Veloso,et al. Multiagent Systems: A Survey from a Machine Learning Perspective , 2000, Auton. Robots.
[33] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition , 1999, J. Artif. Intell. Res..
[34] Andrew W. Lo,et al. Foundations of Technical Analysis: Computational Algorithms, Statistical Inference, and Empirical Implementation , 2000 .
[35] Andrew Y. Ng,et al. Algorithms for Inverse Reinforcement Learning , 2000, ICML.
[36] Sepp Hochreiter,et al. Learning to Learn Using Gradient Descent , 2001, ICANN.
[37] John N. Tsitsiklis,et al. Regression methods for pricing complex American-style options , 2001, IEEE Trans. Neural Networks.
[38] Francis A. Longstaff,et al. Valuing American Options by Simulation: A Simple Least-Squares Approach , 2001 .
[39] Richard S. Sutton,et al. Predictive Representations of State , 2001, NIPS.
[40] Matthew Saffell,et al. Learning to trade via direct reinforcement , 2001, IEEE Trans. Neural Networks.
[41] Matthew L. Ginsberg,et al. GIB: Imperfect Information in a Computationally Challenging Game , 2001, J. Artif. Intell. Res..
[42] Sham M. Kakade,et al. A Natural Policy Gradient , 2001, NIPS.
[43] Sanjoy Dasgupta,et al. Off-Policy Temporal Difference Learning with Function Approximation , 2001, ICML.
[44] Nikolaus Hansen,et al. Completely Derandomized Self-Adaptation in Evolution Strategies , 2001, Evolutionary Computation.
[45] Luis M. Viceira,et al. Appendix for "Strategic Asset Allocation: Portfolio Choice for Long-Term Investors" , 2001 .
[46] Manuela M. Veloso,et al. Multiagent learning using a variable learning rate , 2002, Artif. Intell..
[47] Peter Auer,et al. The Nonstochastic Multiarmed Bandit Problem , 2002, SIAM J. Comput..
[48] Murray Campbell,et al. Deep Blue , 2002, Artif. Intell..
[49] Bernhard Schölkopf,et al. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond , 2005, IEEE Transactions on Neural Networks.
[50] John Langford,et al. Approximately Optimal Approximate Reinforcement Learning , 2002, ICML.
[51] Ronen I. Brafman,et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning , 2001, J. Mach. Learn. Res..
[52] Carlos Guestrin,et al. Generalizing plans to new environments in relational MDPs , 2003, IJCAI 2003.
[53] Sridhar Mahadevan,et al. Recent Advances in Hierarchical Reinforcement Learning , 2003, Discret. Event Dyn. Syst..
[54] Paul Glasserman,et al. Monte Carlo Methods in Financial Engineering , 2003 .
[55] Michail G. Lagoudakis,et al. Least-Squares Policy Iteration , 2003, J. Mach. Learn. Res..
[56] Vijay R. Konda,et al. On Actor-Critic Algorithms , 2003, SIAM J. Control. Optim..
[57] Michael P. Wellman,et al. Nash Q-Learning for General-Sum Stochastic Games , 2003, J. Mach. Learn. Res..
[58] Nuttapong Chentanez,et al. Intrinsically Motivated Reinforcement Learning , 2004, NIPS.
[59] Rich Caruana,et al. Multitask Learning , 1997, Machine Learning.
[60] Steven J. Bradtke,et al. Linear Least-Squares algorithms for temporal difference learning , 2004, Machine Learning.
[61] Ronald J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 2004, Machine Learning.
[62] Michael R. James,et al. Predictive State Representations: A New Theory for Modeling Dynamical Systems , 2004, UAI.
[63] Hector J. Levesque,et al. Knowledge Representation and Reasoning , 2004 .
[64] Tommi S. Jaakkola,et al. Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms , 2000, Machine Learning.
[65] Victor R. Lesser,et al. A survey of multi-agent organizational paradigms , 2004, The Knowledge Engineering Review.
[66] Long Ji Lin,et al. Self-improving reactive agents based on reinforcement learning, planning and teaching , 1992, Machine Learning.
[67] Michael Kearns,et al. Near-Optimal Reinforcement Learning in Polynomial Time , 1998, Machine Learning.
[68] A. Lo. The Adaptive Markets Hypothesis , 2004 .
[69] Andrew W. Moore,et al. Prioritized Sweeping: Reinforcement Learning with Less Data and Less Time , 1993, Machine Learning.
[70] Robert Givan,et al. Relational Reinforcement Learning: An Overview , 2004, ICML 2004.
[71] Richard S. Sutton,et al. Temporal-Difference Networks , 2004, NIPS.
[72] Pieter Abbeel,et al. Apprenticeship learning via inverse reinforcement learning , 2004, ICML.
[73] Pierre Geurts,et al. Tree-Based Batch Mode Reinforcement Learning , 2005, J. Mach. Learn. Res..
[74] Simon Haykin,et al. Cognitive radio: brain-empowered wireless communications , 2005, IEEE Journal on Selected Areas in Communications.
[75] Hinrich Schütze,et al. Introduction to information retrieval , 2008 .
[76] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[77] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.
[78] Christos Dimitrakakis,et al. TORCS, The Open Racing Car Simulator , 2005 .
[79] Richard S. Sutton,et al. Temporal Abstraction in Temporal-difference Networks , 2005, NIPS.
[80] Xiaotie Deng,et al. Settling the Complexity of Two-Player Nash Equilibrium , 2006, 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06).
[81] S. Schaal. Dynamic Movement Primitives -A Framework for Motor Control in Humans and Humanoid Robotics , 2006 .
[82] Rich Caruana,et al. Model compression , 2006, KDD '06.
[83] Toby Walsh,et al. Handbook of Constraint Programming , 2006, Handbook of Constraint Programming.
[84] Geoffrey E. Hinton,et al. Reducing the Dimensionality of Data with Neural Networks , 2006, Science.
[85] Stephen P. Boyd,et al. Convex Optimization , 2004, Algorithms and Theory of Computation Handbook.
[86] Sridhar Mahadevan,et al. Hierarchical multi-agent reinforcement learning , 2001, AGENTS '01.
[87] Csaba Szepesvári,et al. Bandit Based Monte-Carlo Planning , 2006, ECML.
[88] Stefan Schaal,et al. Reinforcement learning by reward-weighted regression for operational space control , 2007, ICML '07.
[89] Yoav Shoham,et al. If multi-agent learning is the answer, what is the question? , 2007, Artif. Intell..
[90] Ben Taskar,et al. Introduction to statistical relational learning , 2007 .
[91] Robert E. Schapire,et al. A Game-Theoretic Approach to Apprenticeship Learning , 2007, NIPS.
[92] John Langford,et al. The Epoch-Greedy Algorithm for Multi-armed Bandits with Side Information , 2007, NIPS.
[93] Benjamin Recht,et al. Random Features for Large-Scale Kernel Machines , 2007, NIPS.
[94] David Thue,et al. Interactive Storytelling: A Player Modelling Approach , 2007, AIIDE.
[95] Pierre-Yves Oudeyer,et al. What is Intrinsic Motivation? A Typology of Computational Approaches , 2007, Frontiers Neurorobotics.
[96] Sridhar Mahadevan,et al. Proto-value Functions: A Laplacian Framework for Learning Representation and Control in Markov Decision Processes , 2007, J. Mach. Learn. Res..
[97] Jonathan Schaeffer,et al. Checkers Is Solved , 2007, Science.
[98] Thomas J. Walsh,et al. Knows what it knows: a framework for self-aware learning , 2008, ICML '08.
[99] Peter Auer,et al. Near-optimal Regret Bounds for Reinforcement Learning , 2008, J. Mach. Learn. Res..
[100] Anind K. Dey,et al. Maximum Entropy Inverse Reinforcement Learning , 2008, AAAI.
[101] R. Sutton,et al. A convergent O(n) algorithm for off-policy temporal-difference learning with linear function approximation , 2008, NIPS.
[102] Nikos A. Vlassis,et al. Optimal and Approximate Q-value Functions for Decentralized POMDPs , 2008, J. Artif. Intell. Res..
[103] Michael L. Littman,et al. An analysis of model-based Interval Estimation for Markov Decision Processes , 2008, J. Comput. Syst. Sci..
[104] Michael H. Bowling,et al. Apprenticeship learning using linear programming , 2008, ICML '08.
[105] Yoav Shoham,et al. Essentials of Game Theory: A Concise Multidisciplinary Introduction , 2008, Essentials of Game Theory: A Concise Multidisciplinary Introduction.
[106] Bart De Schutter,et al. A Comprehensive Survey of Multiagent Reinforcement Learning , 2008, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[107] Yoav Shoham,et al. Multiagent Systems - Algorithmic, Game-Theoretic, and Logical Foundations , 2009 .
[108] Xiaojin Zhu,et al. Introduction to Semi-Supervised Learning , 2009, Synthesis Lectures on Artificial Intelligence and Machine Learning.
[109] John Langford,et al. Search-based structured prediction , 2009, Machine Learning.
[110] Ah Chung Tsoi,et al. The Graph Neural Network Model , 2009, IEEE Transactions on Neural Networks.
[111] Andrew Y. Ng,et al. Near-Bayesian exploration in polynomial time , 2009, ICML '09.
[112] Peter Stone,et al. Transfer Learning for Reinforcement Learning Domains: A Survey , 2009, J. Mach. Learn. Res..
[113] Brett Browning,et al. A survey of robot learning from demonstration , 2009, Robotics Auton. Syst..
[114] Dale Schuurmans,et al. Learning Exercise Policies for American Options , 2009, AISTATS.
[115] VARUN CHANDOLA,et al. Anomaly detection: A survey , 2009, CSUR.
[116] Shimon Whiteson,et al. A theoretical and empirical analysis of Expected Sarsa , 2009, 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning.
[117] Jason Weston,et al. Curriculum learning , 2009, ICML '09.
[118] Richard L. Lewis,et al. Where Do Rewards Come From , 2009 .
[119] Yaoliang Yu,et al. A General Projection Property for Distribution Families , 2009, NIPS.
[120] Shalabh Bhatnagar,et al. Fast gradient-descent methods for temporal-difference learning with linear function approximation , 2009, ICML '09.
[121] Ricardo Vilalta,et al. Metalearning - Applications to Data Mining , 2008, Cognitive Technologies.
[122] Robert Tibshirani,et al. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd Edition , 2001, Springer Series in Statistics.
[123] Nir Friedman,et al. Probabilistic Graphical Models - Principles and Techniques , 2009 .
[124] Masashi Sugiyama,et al. Nonparametric Return Distribution Approximation for Reinforcement Learning , 2010, ICML.
[125] Jürgen Schmidhuber,et al. Formal Theory of Creativity, Fun, and Intrinsic Motivation (1990–2010) , 2010, IEEE Transactions on Autonomous Mental Development.
[126] Richard L. Lewis,et al. Intrinsically Motivated Reinforcement Learning: An Evolutionary Perspective , 2010, IEEE Transactions on Autonomous Mental Development.
[127] Yasemin Altun,et al. Relative Entropy Policy Search , 2010 .
[128] Hado van Hasselt,et al. Double Q-learning , 2010, NIPS.
[129] Vern Paxson,et al. Outside the Closed World: On Using Machine Learning for Network Intrusion Detection , 2010, 2010 IEEE Symposium on Security and Privacy.
[130] Masashi Sugiyama,et al. Parametric Return Density Estimation for Reinforcement Learning , 2010, UAI.
[131] Joelle Pineau,et al. Informing sequential clinical decision-making through reinforcement learning: an empirical study , 2010, Machine Learning.
[132] Qiang Yang,et al. A Survey on Transfer Learning , 2010, IEEE Transactions on Knowledge and Data Engineering.
[133] Anind K. Dey,et al. Modeling Interaction via the Principle of Maximum Causal Entropy , 2010, ICML.
[134] A. Lo,et al. Consumer Credit Risk Models Via Machine-Learning Algorithms , 2010 .
[135] Warren B. Powell,et al. Feature Article - Merging AI and OR to Solve High-Dimensional Stochastic Optimization Problems Using Approximate Dynamic Programming , 2010, INFORMS J. Comput..
[136] Csaba Szepesvári,et al. Algorithms for Reinforcement Learning , 2010, Synthesis Lectures on Artificial Intelligence and Machine Learning.
[137] Wei Chu,et al. A contextual-bandit approach to personalized news article recommendation , 2010, WWW '10.
[138] Carl E. Rasmussen,et al. PILCO: A Model-Based and Data-Efficient Approach to Policy Search , 2011, ICML.
[139] Warren B. Powell,et al. Adaptive Stochastic Control for the Smart Grid , 2011, Proceedings of the IEEE.
[140] Geoffrey J. Gordon,et al. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning , 2010, AISTATS.
[141] Warren B. Powell. Approximate Dynamic Programming: Solving the Curses of Dimensionality , 2007, Wiley Series in Probability and Statistics.
[142] Regina Barzilay,et al. Learning to Win by Reading Manuals in a Monte-Carlo Framework , 2011, ACL.
[143] Jeffrey Pennington,et al. Semi-Supervised Recursive Autoencoders for Predicting Sentiment Distributions , 2011, EMNLP.
[144] Martha White,et al. Linear Off-Policy Actor-Critic , 2012, ICML.
[145] Lihong Li,et al. Sample Complexity Bounds of Exploration , 2012, Reinforcement Learning.
[146] Pedro M. Domingos. A few useful things to know about machine learning , 2012, Commun. ACM.
[147] Kevin P. Murphy,et al. Machine learning - a probabilistic perspective , 2012, Adaptive computation and machine learning series.
[148] D. Yen,et al. Identifying the signs of fraudulent accounts using data mining techniques , 2012, Comput. Hum. Behav..
[149] Xi Fang,et al. Smart Grid — The New and Improved Power Grid: A Survey , 2012, IEEE Communications Surveys & Tutorials.
[150] Marc Toussaint,et al. On Stochastic Optimal Control and Reinforcement Learning by Approximate Inference , 2012, Robotics: Science and Systems.
[151] M. Kosorok,et al. Q-LEARNING WITH CENSORED DATA. , 2012, Annals of statistics.
[152] Michèle Sebag,et al. The grand challenge of computer Go , 2012, Commun. ACM.
[153] Simon M. Lucas,et al. A Survey of Monte Carlo Tree Search Methods , 2012, IEEE Transactions on Computational Intelligence and AI in Games.
[154] Masashi Sugiyama,et al. Artist Agent: A Reinforcement Learning Approach to Automatic Stroke Generation in Oriental Ink Painting , 2012, ICML.
[155] Tara N. Sainath,et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition , 2012 .
[156] Michael H. Bowling,et al. Tractable Objectives for Robust Policy Optimization , 2012, NIPS.
[157] Pascal Vincent,et al. Representation Learning: A Review and New Perspectives , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[158] Jeffrey Dean,et al. Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.
[159] Geoffrey E. Hinton,et al. Speech recognition with deep recurrent neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[160] Santiago Ontañón,et al. A Survey of Real-Time Strategy Game AI Research and Competition in StarCraft , 2013, IEEE Transactions on Computational Intelligence and AI in Games.
[161] Jan Peters,et al. Reinforcement learning in robotics: A survey , 2013, Int. J. Robotics Res..
[162] Christopher Potts,et al. Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank , 2013, EMNLP.
[163] David Silver,et al. Concurrent Reinforcement Learning from Customer Interactions , 2013, ICML.
[164] Liljana Gavrilovska,et al. Learning and Reasoning in Cognitive Radio Networks , 2013, IEEE Communications Surveys & Tutorials.
[165] Marc'Aurelio Ranzato,et al. Building high-level features using large scale unsupervised learning , 2011, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[166] Baher Abdulhai,et al. Multiagent Reinforcement Learning for Integrated Network of Adaptive Traffic Signal Controllers (MARLIN-ATSC): Methodology and Large-Scale Application on Downtown Toronto , 2013, IEEE Transactions on Intelligent Transportation Systems.
[167] Milica Gasic,et al. POMDP-Based Statistical Spoken Dialog Systems: A Review , 2013, Proceedings of the IEEE.
[168] Joelle Pineau,et al. Learning from Limited Demonstrations , 2013, NIPS.
[169] Phil Blunsom,et al. Recurrent Continuous Translation Models , 2013, EMNLP.
[170] Li Deng,et al. Speech-Centric Information Processing: An Optimization-Oriented Approach , 2013, Proceedings of the IEEE.
[171] Daniela M. Witten,et al. An Introduction to Statistical Learning: with Applications in R , 2013 .
[172] Jan Peters,et al. A Survey on Policy Search for Robotics , 2013, Found. Trends Robotics.
[173] Xiao Li,et al. Machine Learning Paradigms for Speech Recognition: An Overview , 2013, IEEE Transactions on Audio, Speech, and Language Processing.
[174] Max Kuhn,et al. Applied Predictive Modeling , 2013 .
[175] Shih-Chieh Huang,et al. MoHex 2.0: A Pattern-Based MCTS Hex Player , 2013, Computers and Games.
[176] Léon Bottou,et al. From machine learning to machine reasoning , 2011, Machine Learning.
[177] Tom Fawcett,et al. Data science for business , 2013 .
[178] Andrew G. Barto,et al. Intrinsic Motivation and Reinforcement Learning , 2013, Intrinsically Motivated Learning in Natural and Artificial Systems.
[179] Suchi Saria,et al. A $3 Trillion Challenge to Computational Scientists: Transforming Healthcare Delivery , 2014, IEEE Intelligent Systems.
[180] Yoshua Bengio,et al. How transferable are features in deep neural networks? , 2014, NIPS.
[181] Shalabh Bhatnagar,et al. Universal Option Models , 2014, NIPS.
[182] Yoshua Bengio,et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation , 2014, EMNLP.
[183] Sergey Levine,et al. Learning Complex Neural Network Policies with Trajectory Optimization , 2014, ICML.
[184] Peter Dayan,et al. Bayes-Adaptive Simulation-based Search with Value Function Approximation , 2014, NIPS.
[185] Richard S. Sutton,et al. Weighted importance sampling for off-policy learning with linear function approximation , 2014, NIPS.
[186] Nitish Srivastava,et al. Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res..
[187] Yoshua Bengio,et al. Generative Adversarial Nets , 2014, NIPS.
[188] Max Welling,et al. Auto-Encoding Variational Bayes , 2013, ICLR.
[189] Max Welling,et al. Semi-supervised Learning with Deep Generative Models , 2014, NIPS.
[190] Guy Lever,et al. Deterministic Policy Gradient Algorithms , 2014, ICML.
[191] Wu He,et al. Internet of Things in Industries: A Survey , 2014, IEEE Transactions on Industrial Informatics.
[192] Hwee Pink Tan,et al. Machine Learning in Wireless Sensor Networks: Algorithms, Strategies, and Applications , 2014, IEEE Communications Surveys & Tutorials.
[193] S. Murphy,et al. Dynamic Treatment Regimes. , 2014, Annual review of statistics and its application.
[194] Alex Graves,et al. Recurrent Models of Visual Attention , 2014, NIPS.
[195] Yoshua Bengio,et al. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling , 2014, ArXiv.
[196] Zoran Popovic,et al. Trading Off Scientific Knowledge and User Learning with Multi-Armed Bandits , 2014, EDM.
[197] Honglak Lee,et al. Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning , 2014, NIPS.
[198] Trevor Darrell,et al. DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition , 2013, ICML.
[199] Alex Graves,et al. Neural Turing Machines , 2014, ArXiv.
[200] Quoc V. Le,et al. Sequence to Sequence Learning with Neural Networks , 2014, NIPS.
[201] Joan Bruna,et al. Intriguing properties of neural networks , 2013, ICLR.
[202] Shie Mannor,et al. Bayesian Reinforcement Learning: A Survey , 2015, Found. Trends Mach. Learn..
[203] Zheng Wen,et al. Optimal Demand Response Using Device-Based Reinforcement Learning , 2014, IEEE Transactions on Smart Grid.
[204] Philip S. Thomas,et al. Personalized Ad Recommendation Systems for Life-Time Value Optimization with Guarantees , 2015, IJCAI.
[205] Geoffrey E. Hinton,et al. Distilling the Knowledge in a Neural Network , 2015, ArXiv.
[206] Neil Burch,et al. Heads-up limit hold’em poker is solved , 2015, Science.
[207] Christopher D. Manning,et al. Advances in natural language processing , 2015, Science.
[208] Jürgen Schmidhuber,et al. Deep learning in neural networks: An overview , 2014, Neural Networks.
[209] Svetlana Lazebnik,et al. Active Object Localization with Deep Reinforcement Learning , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[210] Alessandro Lazaric,et al. Maximum Entropy Semi-Supervised Inverse Reinforcement Learning , 2015, IJCAI.
[211] Kaiming He,et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[212] Sergey Ioffe,et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.
[213] Yoshua Bengio,et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention , 2015, ICML.
[214] Jason Weston,et al. End-To-End Memory Networks , 2015, NIPS.
[215] Kyunghyun Cho,et al. Natural Language Understanding with Distributed Representation , 2015, ArXiv.
[216] Tom Schaul,et al. Universal Value Function Approximators , 2015, ICML.
[217] Yuval Tassa,et al. Learning Continuous Control Policies by Stochastic Value Gradients , 2015, NIPS.
[218] Sergey Levine,et al. Trust Region Policy Optimization , 2015, ICML.
[219] Joshua B. Tenenbaum,et al. Deep Convolutional Inverse Graphics Network , 2015, NIPS.
[220] Jason Weston,et al. Memory Networks , 2014, ICLR.
[221] Michael I. Jordan,et al. Learning Transferable Features with Deep Adaptation Networks , 2015, ICML.
[222] Jiajun Wu,et al. Galileo: Perceiving Physical Object Properties by Integrating a Physics Engine with Deep Learning , 2015, NIPS.
[223] Koray Kavukcuoglu,et al. Multiple Object Recognition with Visual Attention , 2014, ICLR.
[224] Ross B. Girshick,et al. Fast R-CNN , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[225] Joshua B. Tenenbaum,et al. Human-level concept learning through probabilistic program induction , 2015, Science.
[226] Dianhai Yu,et al. Multi-Task Learning for Multiple Language Translation , 2015, ACL.
[227] Navdeep Jaitly,et al. Pointer Networks , 2015, NIPS.
[228] Bolei Zhou,et al. Object Detectors Emerge in Deep Scene CNNs , 2014, ICLR.
[229] Alex Graves,et al. DRAW: A Recurrent Neural Network For Image Generation , 2015, ICML.
[230] Geoffrey E. Hinton,et al. Deep Learning , 2015, Nature.
[231] Martin A. Riedmiller,et al. Embed to Control: A Locally Linear Latent Dynamics Model for Control from Raw Images , 2015, NIPS.
[232] Jonathon Shlens,et al. Explaining and Harnessing Adversarial Examples , 2014, ICLR.
[233] Michael L. Littman,et al. Reinforcement learning improves behaviour from evaluative feedback , 2015, Nature.
[234] Javier García,et al. A comprehensive survey on safe reinforcement learning , 2015, J. Mach. Learn. Res..
[235] Regina Barzilay,et al. Language Understanding for Text-based Games using Deep Reinforcement Learning , 2015, EMNLP.
[236] Luís Paulo Reis,et al. Model-Based Relative Entropy Stochastic Search , 2016, NIPS.
[237] Michael I. Jordan,et al. Machine learning: Trends, perspectives, and prospects , 2015, Science.
[238] Shane Legg,et al. Massively Parallel Methods for Deep Reinforcement Learning , 2015, ArXiv.
[239] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[240] Honglak Lee,et al. Action-Conditional Video Prediction using Deep Networks in Atari Games , 2015, NIPS.
[241] Ross A. Knepper,et al. DeepMPC: Learning Deep Latent Features for Model Predictive Control , 2015, Robotics: Science and Systems.
[242] Ivan Laptev,et al. Is object localization for free? - Weakly-supervised learning with convolutional neural networks , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[243] Peter Stone,et al. Deep Recurrent Q-Learning for Partially Observable MDPs , 2015, AAAI Fall Symposia.
[244] Marc G. Bellemare,et al. The Arcade Learning Environment: An Evaluation Platform for General Agents (Extended Abstract) , 2012, IJCAI.
[245] M. Kosorok,et al. Adaptive Treatment Strategies in Practice: Planning Trials and Analyzing Data for Personalized Medicine , 2015 .
[246] Pedro M. Domingos,et al. The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World , 2015 .
[247] Yoshua Bengio,et al. Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.
[248] Andrew Zisserman,et al. Spatial Transformer Networks , 2015, NIPS.
[249] Michal Valko,et al. Bayesian Policy Gradient and Actor-Critic Algorithms , 2016, J. Mach. Learn. Res..
[250] Sergey Levine,et al. Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization , 2016, ICML.
[251] Yuandong Tian,et al. Better Computer Go Player with Neural Network and Long-term Prediction , 2016, ICLR.
[252] George Saon,et al. The IBM 2016 English Conversational Telephone Speech Recognition System , 2016, INTERSPEECH.
[253] Shimon Whiteson,et al. Learning to Communicate with Deep Multi-Agent Reinforcement Learning , 2016, NIPS.
[254] Alex Graves,et al. Conditional Image Generation with PixelCNN Decoders , 2016, NIPS.
[255] Stefano Ermon,et al. Model-Free Imitation Learning with Policy Optimization , 2016, ICML.
[256] Xin Zhang,et al. End to End Learning for Self-Driving Cars , 2016, ArXiv.
[257] Samuel Gershman,et al. Deep Successor Reinforcement Learning , 2016, ArXiv.
[258] Jianfeng Gao,et al. Deep Reinforcement Learning for Dialogue Generation , 2016, EMNLP.
[259] Christopher D. Manning,et al. Learning Language Games through Interaction , 2016, ACL.
[260] Pieter Abbeel,et al. Benchmarking Deep Reinforcement Learning for Continuous Control , 2016, ICML.
[261] Philip H. S. Torr,et al. Playing Doom with SLAM-Augmented Deep Reinforcement Learning , 2016, ArXiv.
[262] Srikanth Kandula,et al. Resource Management with Deep Reinforcement Learning , 2016, HotNets.
[263] Michael C. Fu,et al. Cumulative Prospect Theory Meets Reinforcement Learning: Prediction and Control , 2015, ICML.
[264] Ruslan Salakhutdinov,et al. Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning , 2015, ICLR.
[265] Kavosh Asadi,et al. A New Softmax Operator for Reinforcement Learning , 2016, ArXiv.
[266] Anca D. Dragan,et al. Cooperative Inverse Reinforcement Learning , 2016, NIPS.
[267] Tie-Yan Liu,et al. Dual Learning for Machine Translation , 2016, NIPS.
[268] David Pfau,et al. Connecting Generative Adversarial Networks and Actor-Critic Methods , 2016, ArXiv.
[269] Geoffrey E. Hinton,et al. Attend, Infer, Repeat: Fast Scene Understanding with Generative Models , 2016, NIPS.
[270] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[271] Alexander J. Smola,et al. Stacked Attention Networks for Image Question Answering , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[272] Nan Jiang,et al. Doubly Robust Off-policy Value Evaluation for Reinforcement Learning , 2015, ICML.
[273] Ian J. Goodfellow,et al. Technical Report on the CleverHans v2.1.0 Adversarial Examples Library , 2016 .
[274] Filip De Turck,et al. VIME: Variational Information Maximizing Exploration , 2016, NIPS.
[275] Le Song,et al. Discriminative Embeddings of Latent Variable Models for Structured Data , 2016, ICML.
[276] Cristian Sminchisescu,et al. Reinforcement Learning for Visual Object Detection , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[277] Marc'Aurelio Ranzato,et al. Sequence Level Training with Recurrent Neural Networks , 2015, ICLR.
[278] Pieter Abbeel,et al. InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets , 2016, NIPS.
[279] Marcin Andrychowicz,et al. Learning to learn by gradient descent by gradient descent , 2016, NIPS.
[280] David Silver,et al. Deep Reinforcement Learning with Double Q-Learning , 2015, AAAI.
[281] Marlos C. Machado,et al. State of the Art Control of Atari Games Using Shallow Reinforcement Learning , 2015, AAMAS.
[282] Roy Fox,et al. Taming the Noise in Reinforcement Learning via Soft Updates , 2015, UAI.
[283] Stefano Ermon,et al. Generative Adversarial Imitation Learning , 2016, NIPS.
[284] Benjamin Van Roy,et al. Deep Exploration via Bootstrapped DQN , 2016, NIPS.
[285] Alex Graves,et al. Strategic Attentive Writer for Learning Macro-Actions , 2016, NIPS.
[286] Murat Kantarcioglu,et al. Adversarial Data Mining: Big Data Meets Cyber Security , 2016, CCS.
[287] Tom Schaul,et al. Dueling Network Architectures for Deep Reinforcement Learning , 2015, ICML.
[288] Tor Lattimore,et al. Causal Bandits: Learning Good Interventions via Causal Inference , 2016, NIPS.
[289] Rob Fergus,et al. Learning Multiagent Communication with Backpropagation , 2016, NIPS.
[290] Honglak Lee,et al. Control of Memory, Active Perception, and Action in Minecraft , 2016, ICML.
[291] Shan Carter,et al. Attention and Augmented Recurrent Neural Networks , 2016 .
[292] Maosong Sun,et al. Semi-Supervised Learning for Neural Machine Translation , 2016, ACL.
[293] Justin A. Sirignano. Deep learning for limit order books , 2016, Quantitative Finance.
[294] Jianfeng Gao,et al. Deep Reinforcement Learning with a Combinatorial Action Space for Predicting Popular Reddit Threads , 2016, EMNLP.
[295] Jianfeng Gao,et al. Deep Reinforcement Learning with a Natural Language Action Space , 2015, ACL.
[296] Jing He,et al. Policy Networks with Two-Stage Training for Dialogue Systems , 2016, SIGDIAL Conference.
[297] Masayoshi Tomizuka,et al. Algorithmic safety measures for intelligent industrial co-robots , 2016, 2016 IEEE International Conference on Robotics and Automation (ICRA).
[298] Alex Graves,et al. Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.
[299] Marina Krakovsky. Reinforcement renaissance , 2016, Commun. ACM.
[300] Tom Schaul,et al. Unifying Count-Based Exploration and Intrinsic Motivation , 2016, NIPS.
[301] Joshua B. Tenenbaum,et al. Building machines that learn and think like people , 2016, Behavioral and Brain Sciences.
[302] Samy Bengio,et al. Can Active Memory Replace Attention? , 2016, NIPS.
[303] Uri Shalit,et al. Learning Representations for Counterfactual Inference , 2016, ICML.
[304] Philip Bachman,et al. Natural Language Comprehension with the EpiReader , 2016, EMNLP.
[305] Sergio Gomez Colmenarejo,et al. Hybrid computing using a neural network with dynamic external memory , 2016, Nature.
[306] David Vandyke,et al. On-line Active Reward Learning for Policy Optimisation in Spoken Dialogue Systems , 2016, ACL.
[307] Alexander M. Rush,et al. Abstractive Sentence Summarization with Attentive Recurrent Neural Networks , 2016, NAACL.
[308] Nikolaus Hansen,et al. The CMA Evolution Strategy: A Tutorial , 2016, ArXiv.
[309] Shuicheng Yan,et al. Tree-Structured Reinforcement Learning for Sequential Object Localization , 2016, NIPS.
[310] Pieter Abbeel,et al. Value Iteration Networks , 2016, NIPS.
[311] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[312] Abhinav Gupta. Supersizing Self-Supervision: Learning Perception and Action Without Human Supervision , 2016 .
[313] Jitendra Malik,et al. Learning to Poke by Poking: Experiential Learning of Intuitive Physics , 2016, NIPS.
[314] Taghi M. Khoshgoftaar,et al. A survey of transfer learning , 2016, Journal of Big Data.
[315] Yu Zhang,et al. Personalizing a Dialogue System with Transfer Learning , 2016, ArXiv.
[316] Kyunghyun Cho,et al. End-to-End Goal-Driven Web Navigation , 2016, NIPS.
[317] David Silver,et al. Deep Reinforcement Learning from Self-Play in Imperfect-Information Games , 2016, ArXiv.
[318] Md. Mustafizur Rahman,et al. Neural Information Retrieval: A Literature Review , 2016, ArXiv.
[319] Regina Barzilay,et al. Improving Information Extraction by Acquiring External Evidence with Reinforcement Learning , 2016, EMNLP.
[320] Razvan Pascanu,et al. Interaction Networks for Learning about Objects, Relations and Physics , 2016, NIPS.
[321] Peter Stone,et al. Deep Reinforcement Learning in Parameterized Action Space , 2015, ICLR.
[322] Michael I. Jordan,et al. Unsupervised Domain Adaptation with Residual Transfer Networks , 2016, NIPS.
[323] Nando de Freitas,et al. Neural Programmer-Interpreters , 2015, ICLR.
[324] Sergey Levine,et al. End-to-End Training of Deep Visuomotor Policies , 2015, J. Mach. Learn. Res..
[325] Oriol Vinyals,et al. Matching Networks for One Shot Learning , 2016, NIPS.
[326] Tom Schaul,et al. Prioritized Experience Replay , 2015, ICLR.
[327] Yelong Shen,et al. ReasoNet: Learning to Stop Reading in Machine Comprehension , 2016, CoCo@NIPS.
[328] Geoffrey E. Hinton,et al. Using Fast Weights to Attend to the Recent Past , 2016, NIPS.
[329] Jim Duggan,et al. An Experimental Review of Reinforcement Learning Algorithms for Adaptive Traffic Signal Control , 2016, Autonomic Road Transport Support Systems.
[330] Sergey Levine,et al. High-Dimensional Continuous Control Using Generalized Advantage Estimation , 2015, ICLR.
[331] Sergey Levine,et al. Continuous Deep Q-Learning with Model-based Acceleration , 2016, ICML.
[332] Joshua B. Tenenbaum,et al. Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation , 2016, NIPS.
[333] Martha White,et al. An Emphatic Approach to the Problem of Off-policy Temporal-Difference Learning , 2015, J. Mach. Learn. Res..
[334] Alex Graves,et al. Associative Long Short-Term Memory , 2016, ICML.
[335] George Kurian,et al. Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation , 2016, ArXiv.
[336] Nicolas Usunier,et al. Episodic Exploration for Deep Deterministic Policies: An Application to StarCraft Micromanagement Tasks , 2016, ArXiv.
[337] Heiga Zen,et al. WaveNet: A Generative Model for Raw Audio , 2016, SSW.
[338] Gökhan Tür,et al. End-to-End Memory Networks with Knowledge Carryover for Multi-Turn Spoken Language Understanding , 2016, INTERSPEECH.
[339] Ying Zhang,et al. Towards End-to-End Speech Recognition with Deep Convolutional Neural Networks , 2016, INTERSPEECH.
[340] Marek Petrik,et al. Proximal Gradient Temporal Difference Learning Algorithms , 2016, IJCAI.
[341] J. Pearl,et al. Causal Inference in Statistics: A Primer , 2016 .
[342] John Schulman,et al. Concrete Problems in AI Safety , 2016, ArXiv.
[343] Maxine Eskénazi,et al. Towards End-to-End Learning for Dialog State Tracking and Management using Deep Reinforcement Learning , 2016, SIGDIAL Conference.
[344] Rudolf Kadlec,et al. Text Understanding with the Attention Sum Reader Network , 2016, ACL.
[345] Josef Urban,et al. DeepMath - Deep Sequence Models for Premise Selection , 2016, NIPS.
[346] Martha White,et al. Investigating Practical Linear Temporal Difference Learning , 2016, AAMAS.
[347] Jing He,et al. A Sequence-to-Sequence Model for User Simulation in Spoken Dialogue Systems , 2016, INTERSPEECH.
[348] Razvan Pascanu,et al. A simple neural network module for relational reasoning , 2017, NIPS.
[349] Sergey Levine,et al. Collective robot reinforcement learning with distributed asynchronous guided policy search , 2016, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[350] Tom Schaul,et al. FeUdal Networks for Hierarchical Reinforcement Learning , 2017, ICML.
[351] David Vandyke,et al. A Network-based End-to-End Trainable Task-oriented Dialogue System , 2016, EACL.
[352] Joelle Pineau,et al. An Actor-Critic Algorithm for Sequence Prediction , 2016, ICLR.
[353] Filip De Turck,et al. #Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning , 2016, NIPS.
[354] Kirthevasan Kandasamy,et al. Batch Policy Gradient Methods for Improving Neural Conversation Models , 2017, ICLR.
[355] Marc'Aurelio Ranzato,et al. Gradient Episodic Memory for Continual Learning , 2017, NIPS.
[356] Marc G. Bellemare,et al. Count-Based Exploration with Neural Density Models , 2017, ICML.
[357] Lei Zhang,et al. Sentiment Analysis and Opinion Mining , 2017, Encyclopedia of Machine Learning and Data Mining.
[358] Doina Precup,et al. The Option-Critic Architecture , 2016, AAAI.
[359] Matthew R. G. Brown,et al. Learning stable and predictive network-based patterns of schizophrenia and its clinical symptoms , 2017, npj Schizophrenia.
[360] Siqi Liu,et al. Improved Image Captioning via Policy Gradient optimization of SPIDEr , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).
[361] Quoc V. Le,et al. Neural Optimizer Search with Reinforcement Learning , 2017, ICML.
[362] Zhao Chen,et al. The Game Imitation: Deep Supervised Convolutional Networks for Quick Video Game AI , 2017, ArXiv.
[363] Liang Lin,et al. Attention-Aware Face Hallucination via Deep Reinforcement Learning , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[364] Wolfram Burgard,et al. Deep reinforcement learning with successor features for navigation across similar environments , 2016, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[365] Percy Liang,et al. From Language to Programs: Bridging Reinforcement Learning and Maximum Marginal Likelihood , 2017, ACL.
[366] Philip S. Yu,et al. Learning Multiple Tasks with Multilinear Relationship Networks , 2015, NIPS.
[367] Martin A. Riedmiller,et al. Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards , 2017, ArXiv.
[368] Peter Stone,et al. Intrinsically motivated model learning for developing curious robots , 2017, Artif. Intell..
[369] Oliver Brock,et al. Interactive Perception: Leveraging Action in Perception and Perception in Action , 2016, IEEE Transactions on Robotics.
[370] Bhaskara Marthi,et al. A generative vision model that trains with high data efficiency and breaks text-based CAPTCHAs , 2017, Science.
[371] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[372] Razvan Pascanu,et al. Visual Interaction Networks: Learning a Physics Simulator from Video , 2017, NIPS.
[373] Alexei A. Efros,et al. Curiosity-Driven Exploration by Self-Supervised Prediction , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[374] Ji Feng,et al. Deep Forest: Towards An Alternative to Deep Neural Networks , 2017, IJCAI.
[375] Jason Weston,et al. Learning through Dialogue Interactions by Asking Questions , 2016, ICLR.
[376] Ameet Talwalkar,et al. Federated Multi-Task Learning , 2017, NIPS.
[377] Zeb Kurth-Nelson,et al. Learning to reinforcement learn , 2016, CogSci.
[378] Lantao Yu,et al. SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient , 2016, AAAI.
[379] Balaraman Ravindran,et al. Learning to Repeat: Fine Grained Action Repetition for Deep Reinforcement Learning , 2017, ICLR.
[380] Graham Neubig,et al. Neural Machine Translation and Sequence-to-sequence Models: A Tutorial , 2017, ArXiv.
[381] Elman Mansimov,et al. Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation , 2017, NIPS.
[382] José M. F. Moura,et al. Natural Language Does Not Emerge ‘Naturally’ in Multi-Agent Dialog , 2017, EMNLP.
[383] Nan Jiang,et al. Contextual Decision Processes with low Bellman rank are PAC-Learnable , 2016, ICML.
[384] Pieter Abbeel,et al. Third-Person Imitation Learning , 2017, ICLR.
[385] Demis Hassabis,et al. Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm , 2017, ArXiv.
[386] Zachary C. Lipton,et al. Improving Factor-Based Quantitative Investing by Forecasting Company Fundamentals , 2017, ArXiv.
[387] Tom Schaul,et al. The Predictron: End-To-End Learning and Planning , 2016, ICML.
[388] Dan Klein,et al. Modular Multitask Reinforcement Learning with Policy Sketches , 2016, ICML.
[389] Shimon Whiteson,et al. Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning , 2017, ICML.
[390] Shie Mannor,et al. A Deep Hierarchical Approach to Lifelong Learning in Minecraft , 2016, AAAI.
[391] Fan Yang,et al. Good Semi-supervised Learning That Requires a Bad GAN , 2017, NIPS.
[392] Pieter Abbeel,et al. Stochastic Neural Networks for Hierarchical Reinforcement Learning , 2016, ICLR.
[393] Nicholas Rhinehart,et al. First-Person Activity Forecasting with Online Inverse Reinforcement Learning , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).
[394] Misha Denil,et al. Learning to Perform Physics Experiments via Deep Reinforcement Learning , 2016, ICLR.
[395] Stefano Ermon,et al. InfoGAIL: Interpretable Imitation Learning from Visual Demonstrations , 2017, NIPS.
[396] Marcin Andrychowicz,et al. Hindsight Experience Replay , 2017, NIPS.
[397] Yann Dauphin,et al. Convolutional Sequence to Sequence Learning , 2017, ICML.
[398] Zheng Zhang,et al. Saliency-based Sequential Image Attention with Multiset Prediction , 2017, NIPS.
[399] Bart De Schutter,et al. Residential Demand Response of Thermostatically Controlled Loads Using Batch Reinforcement Learning , 2017, IEEE Transactions on Smart Grid.
[400] Sergey Levine,et al. Generalizing Skills with Semi-Supervised Reinforcement Learning , 2016, ICLR.
[401] Stephen Tyree,et al. Reinforcement Learning through Asynchronous Advantage Actor-Critic on a GPU , 2016, ICLR.
[402] Vladlen Koltun,et al. Learning to Act by Predicting the Future , 2016, ICLR.
[403] Chen Liang,et al. Neural Symbolic Machines: Learning Semantic Parsers on Freebase with Weak Supervision , 2016, ACL.
[404] Lihong Li,et al. Stochastic Variance Reduction Methods for Policy Evaluation , 2017, ICML.
[405] Xi Chen,et al. Evolution Strategies as a Scalable Alternative to Reinforcement Learning , 2017, ArXiv.
[406] Razvan Pascanu,et al. Sim-to-Real Robot Learning from Pixels with Progressive Nets , 2016, CoRL.
[407] Tom Schaul,et al. Building Machines that Learn and Think for Themselves: Commentary on Lake et al. , 2017, Behavioral and Brain Sciences.
[408] Razvan Pascanu,et al. Overcoming catastrophic forgetting in neural networks , 2016, Proceedings of the National Academy of Sciences.
[409] Ramakanth Pasunuru,et al. Reinforced Video Captioning with Entailment Rewards , 2017, EMNLP.
[410] Yuandong Tian,et al. ELF: An Extensive, Lightweight and Flexible Research Platform for Real-time Strategy Games , 2017, NIPS.
[411] Yedid Hoshen,et al. VAIN: Attentional Multi-agent Predictive Modeling , 2017, NIPS.
[412] Tim Rocktäschel,et al. End-to-end Differentiable Proving , 2017, NIPS.
[413] Anca D. Dragan,et al. Inverse Reward Design , 2017, NIPS.
[414] Wang Ling,et al. Learning to Compose Words into Sentences with Reinforcement Learning , 2016, ICLR.
[415] Byron Boots,et al. Predictive-State Decoders: Encoding the Future into Recurrent Networks , 2017, NIPS.
[416] Jitendra Malik,et al. Learning to Optimize Neural Nets , 2017, ArXiv.
[417] Shane Legg,et al. Deep Reinforcement Learning from Human Preferences , 2017, NIPS.
[418] Been Kim,et al. Towards A Rigorous Science of Interpretable Machine Learning , 2017, ArXiv.
[419] Marcin Andrychowicz,et al. One-Shot Imitation Learning , 2017, NIPS.
[420] Tom Schaul,et al. StarCraft II: A New Challenge for Reinforcement Learning , 2017, ArXiv.
[421] Lihong Li,et al. Neuro-Symbolic Program Synthesis , 2016, ICLR.
[422] Jianfeng Gao,et al. End-to-End Task-Completion Neural Dialogue Systems , 2017, IJCNLP.
[423] Gang Hua,et al. Collaborative Deep Reinforcement Learning for Joint Object Search , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[424] Marijn F. Stollenga,et al. Continual curiosity-driven skill acquisition from high-dimensional video inputs for humanoid robots , 2017, Artif. Intell..
[425] Rémi Munos,et al. Minimax Regret Bounds for Reinforcement Learning , 2017, ICML.
[426] Quoc V. Le,et al. Neural Architecture Search with Reinforcement Learning , 2016, ICLR.
[427] Tomas Pfister,et al. Learning from Simulated and Unsupervised Images through Adversarial Training , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[428] Aurko Roy,et al. Learning to Remember Rare Events , 2017, ICLR.
[429] Yann Dauphin,et al. Deal or No Deal? End-to-End Learning of Negotiation Dialogues , 2017, EMNLP.
[430] Vaibhava Goel,et al. Self-Critical Sequence Training for Image Captioning , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[431] Damien Ernst,et al. Reinforcement Learning for Electric Power System Decision and Control: Past Considerations and Perspectives , 2017 .
[432] Ramesh Raskar,et al. Designing Neural Network Architectures using Reinforcement Learning , 2016, ICLR.
[433] Sebastian Ruder,et al. An Overview of Multi-Task Learning in Deep Neural Networks , 2017, ArXiv.
[434] Nahum Shimkin,et al. Averaged-DQN: Variance Reduction and Stabilization for Deep Reinforcement Learning , 2016, ICML.
[435] Qinru Qiu,et al. A Hierarchical Framework of Cloud Resource Allocation and Power Management Using Deep Reinforcement Learning , 2017, 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS).
[436] Nan Jiang,et al. Repeated Inverse Reinforcement Learning , 2017, NIPS.
[437] Richard E. Turner,et al. Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning , 2017, NIPS.
[438] Raymond Y. K. Lau,et al. Least Squares Generative Adversarial Networks , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).
[439] Peng Peng,et al. Multiagent Bidirectionally-Coordinated Nets: Emergence of Human-level Coordination in Learning to Play StarCraft Combat Games , 2017, ArXiv.
[440] A. Ng,et al. MURA: Large Dataset for Abnormality Detection in Musculoskeletal Radiographs , 2017, ArXiv.
[441] Ali Farhadi,et al. Target-driven visual navigation in indoor scenes using deep reinforcement learning , 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[442] Tom M. Mitchell,et al. Leveraging Knowledge Bases in LSTMs for Improving Machine Reading , 2017, ACL.
[443] Yi Wu,et al. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments , 2017, NIPS.
[444] Junjie Yan,et al. Practical Network Blocks Design with Q-Learning , 2017, ArXiv.
[445] Eric P. Xing,et al. Deep Variation-Structured Reinforcement Learning for Visual Relationship and Attribute Detection , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[446] Don Monroe. Deep learning takes on translation , 2017, Commun. ACM.
[447] Marlos C. Machado,et al. A Laplacian Framework for Option Discovery in Reinforcement Learning , 2017, ICML.
[448] Justin Fu,et al. EX2: Exploration with Exemplar Models for Deep Reinforcement Learning , 2017, NIPS.
[449] Wenhan Xiong,et al. DeepPath: A Reinforcement Learning Method for Knowledge Graph Reasoning , 2017, EMNLP.
[450] Yang Liu,et al. Learning to Play in a Day: Faster Deep Reinforcement Learning by Optimality Tightening , 2016, ICLR.
[451] Andreas Krause,et al. Safe Model-based Reinforcement Learning with Stability Guarantees , 2017, NIPS.
[452] Sebastian Nowozin,et al. DeepCoder: Learning to Write Programs , 2016, ICLR.
[453] David Sontag,et al. Learning a Health Knowledge Graph from Electronic Medical Records , 2017, Scientific Reports.
[454] Balaraman Ravindran,et al. Attend, Adapt and Transfer: Attentive Deep Architecture for Adaptive Transfer from multiple sources in the same domain , 2015, ICLR.
[455] Sergey Levine,et al. Deep visual foresight for planning robot motion , 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[456] Sergey Levine,et al. Reinforcement Learning with Deep Energy-Based Policies , 2017, ICML.
[457] Gabriel Synnaeve,et al. STARDATA: A StarCraft AI Research Dataset , 2017, AIIDE.
[458] Kyunghyun Cho,et al. Task-Oriented Query Reformulation with Reinforcement Learning , 2017, EMNLP.
[459] Dileep George,et al. Schema Networks: Zero-shot Transfer with a Generative Causal Model of Intuitive Physics , 2017, ICML.
[460] Dale Schuurmans,et al. Bridging the Gap Between Value and Policy Based Reinforcement Learning , 2017, NIPS.
[461] Jin Young Choi,et al. Action-Decision Networks for Visual Tracking with Deep Reinforcement Learning , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[462] Oladimeji Farri,et al. Diagnostic Inferencing via Improving Clinical Concept Extraction with Deep Reinforcement Learning: A Preliminary Study , 2017, MLHC.
[463] Lukasz Kaiser,et al. One Model To Learn Them All , 2017, ArXiv.
[464] Tuomas Sandholm,et al. Safe and Nested Subgame Solving for Imperfect-Information Games , 2017, NIPS.
[465] Geoffrey Zweig,et al. Hybrid Code Networks: practical and efficient end-to-end dialog control with supervised and reinforcement learning , 2017, ACL.
[466] Byron Boots,et al. Predictive State Recurrent Neural Networks , 2017, NIPS.
[467] Kai-Uwe Kühnberger,et al. Neural-Symbolic Learning and Reasoning: A Survey and Interpretation , 2017, Neuro-Symbolic Artificial Intelligence.
[468] Yuxi Li,et al. Deep Reinforcement Learning: An Overview , 2017, ArXiv.
[469] Richard Socher,et al. Knowing When to Look: Adaptive Attention via a Visual Sentinel for Image Captioning , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[470] Luke S. Zettlemoyer,et al. Deep Semantic Role Labeling: What Works and What’s Next , 2017, ACL.
[471] Lawrence D. Jackel,et al. Explaining How a Deep Neural Network Trained with End-to-End Learning Steers a Car , 2017, ArXiv.
[472] Yuval Tassa,et al. Emergence of Locomotion Behaviours in Rich Environments , 2017, ArXiv.
[473] Christopher Burgess,et al. beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework , 2016, ICLR.
[474] Deva Ramanan,et al. Tracking as Online Decision-Making: Learning a Policy from Streaming Videos with Reinforcement Learning , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[475] David Berthelot,et al. BEGAN: Boundary Equilibrium Generative Adversarial Networks , 2017, ArXiv.
[476] Alexander M. Rush,et al. OpenNMT: Open-Source Toolkit for Neural Machine Translation , 2017, ACL.
[477] Bhaskar Mitra,et al. Neural Models for Information Retrieval , 2017, ArXiv.
[478] Alexander Knapp,et al. Transferring Context-Dependent Test Inputs , 2017, 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS).
[479] Geoffrey Zweig,et al. The Microsoft 2016 conversational speech recognition system , 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[480] Jacob Biamonte,et al. Quantum machine learning , 2016, Nature.
[481] Yuandong Tian,et al. Training Agent for First-Person Shooter Game with Actor-Critic Curriculum Learning , 2016, ICLR.
[482] Jonathan P. How,et al. Deep Decentralized Multi-task Multi-Agent Reinforcement Learning under Partial Observability , 2017, ICML.
[483] Masayoshi Tomizuka,et al. Designing the Robot Behavior for Safe Human–Robot Interactions , 2017 .
[484] Ming Zhou,et al. Gated Self-Matching Networks for Reading Comprehension and Question Answering , 2017, ACL.
[485] Misha Denil,et al. Learned Optimizers that Scale and Generalize , 2017, ICML.
[486] Sergey Levine,et al. Learning Invariant Feature Spaces to Transfer Skills with Reinforcement Learning , 2017, ICLR.
[487] Kenneth O. Stanley,et al. Deep Neuroevolution: Genetic Algorithms Are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning , 2017, ArXiv.
[488] Rachid Guerraoui,et al. Dynamic Safe Interruptibility for Decentralized Multi-Agent Reinforcement Learning , 2017, NIPS.
[489] Richard Socher,et al. Learned in Translation: Contextualized Word Vectors , 2017, NIPS.
[490] Peng Zhang,et al. IRGAN: A Minimax Game for Unifying Generative and Discriminative Information Retrieval Models , 2017, SIGIR.
[491] Samy Bengio,et al. Device Placement Optimization with Reinforcement Learning , 2017, ICML.
[492] Tom M. Mitchell,et al. What can machine learning do? Workforce implications , 2017, Science.
[493] Stefan Lee,et al. Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[494] Marc G. Bellemare,et al. A Distributional Perspective on Reinforcement Learning , 2017, ICML.
[495] Richard S. Zemel,et al. Prototypical Networks for Few-shot Learning , 2017, NIPS.
[496] Demis Hassabis,et al. Mastering the game of Go without human knowledge , 2017, Nature.
[497] Alexei A. Efros,et al. Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[498] Geoffrey E. Hinton,et al. Dynamic Routing Between Capsules , 2017, NIPS.
[499] Ning Zhang,et al. Deep Reinforcement Learning-Based Image Captioning with Embedding Reward , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[500] Kam-Fai Wong,et al. Composite Task-Completion Dialogue System via Hierarchical Deep Reinforcement Learning , 2017, ArXiv.
[501] Sergey Levine,et al. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks , 2017, ICML.
[502] Sandy H. Huang,et al. Adversarial Attacks on Neural Network Policies , 2017, ICLR.
[503] Padhraic Smyth,et al. Science and data science , 2017, Proceedings of the National Academy of Sciences.
[504] Nando de Freitas,et al. Robust Imitation of Diverse Behaviors , 2017, NIPS.
[505] Satinder Singh,et al. Value Prediction Network , 2017, NIPS.
[506] Yee Whye Teh,et al. Distral: Robust multitask reinforcement learning , 2017, NIPS.
[507] Marc G. Bellemare,et al. The Cramer Distance as a Solution to Biased Wasserstein Gradients , 2017, ArXiv.
[508] Pieter Abbeel,et al. Equivalence Between Policy Gradients and Soft Q-Learning , 2017, ArXiv.
[509] Yuan Li,et al. Learning how to Active Learn: A Deep Reinforcement Learning Approach , 2017, EMNLP.
[510] Dawn Song,et al. Robust Physical-World Attacks on Deep Learning Models , 2017, ArXiv.
[511] Dirk Ormoneit,et al. Kernel-Based Reinforcement Learning , 2017, Encyclopedia of Machine Learning and Data Mining.
[512] Razvan Pascanu,et al. Learning to Navigate in Complex Environments , 2016, ICLR.
[513] Joel Z. Leibo,et al. Multi-agent Reinforcement Learning in Sequential Social Dilemmas , 2017, AAMAS.
[514] Ser-Nam Lim,et al. A Reinforcement Learning Approach to the View Planning Problem , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[515] Samy Bengio,et al. Neural Combinatorial Optimization with Reinforcement Learning , 2016, ICLR.
[516] Tom Schaul,et al. Reinforcement Learning with Unsupervised Auxiliary Tasks , 2016, ICLR.
[517] Tom Schaul,et al. Successor Features for Transfer in Reinforcement Learning , 2016, NIPS.
[518] Yann LeCun,et al. Model-Based Planning in Discrete Action Spaces , 2017, ArXiv.
[519] Randy H. Katz,et al. A Berkeley View of Systems Challenges for AI , 2017, ArXiv.
[520] Alec Radford,et al. Proximal Policy Optimization Algorithms , 2017, ArXiv.
[521] Joel Z. Leibo,et al. A multi-agent reinforcement learning model of common-pool resource appropriation , 2017, NIPS.
[522] Bernhard Schölkopf,et al. Elements of Causal Inference: Foundations and Learning Algorithms , 2017 .
[523] Guillaume Lample,et al. Playing FPS Games with Deep Reinforcement Learning , 2016, AAAI.
[524] Olivier Pietquin,et al. End-to-end optimization of goal-driven and visually grounded dialogue systems , 2017, IJCAI.
[525] Sergey Levine,et al. Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates , 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[526] Jiajun Wu,et al. Neural Scene De-rendering , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[527] Yuval Tassa,et al. Learning human behaviors from motion capture by adversarial imitation , 2017, ArXiv.
[528] Martín Abadi,et al. Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data , 2016, ICLR.
[529] Cezary Kaliszyk,et al. Deep Network Guided Proof Search , 2017, LPAR.
[530] Richard Socher,et al. Dynamic Coattention Networks For Question Answering , 2016, ICLR.
[531] Kaiming He,et al. Mask R-CNN , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[532] Jitendra Malik,et al. Learning to Optimize , 2016, ICLR.
[533] Joshua B. Tenenbaum,et al. Beating the World's Best at Super Smash Bros. with Deep Reinforcement Learning , 2017, ArXiv.
[534] Aaron C. Courville,et al. Improved Training of Wasserstein GANs , 2017, NIPS.
[535] Ping Tan,et al. DualGAN: Unsupervised Dual Learning for Image-to-Image Translation , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[536] Jiwen Lu,et al. 3DCNN-DQN-RNN: A Deep Reinforcement Learning Framework for Semantic Parsing of Large-Scale 3D Point Clouds , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[537] Jiwen Lu,et al. Attention-Aware Deep Reinforcement Learning for Video Face Recognition , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[538] Le Song,et al. Learning Combinatorial Optimization Algorithms over Graphs , 2017, NIPS.
[539] Julie A. Shah,et al. C-LEARN: Learning geometric constraints from demonstrations for multi-step manipulation in shared autonomy , 2017, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[540] Nikos Komodakis,et al. Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer , 2016, ICLR.
[541] Jason Weston,et al. Learning End-to-End Goal-Oriented Dialog , 2016, ICLR.
[542] Kunle Olukotun,et al. Infrastructure for Usable Machine Learning: The Stanford DAWN Project , 2017, ArXiv.
[543] Jianfeng Gao,et al. Towards End-to-End Reinforcement Learning of Dialogue Agents for Information Access , 2016, ACL.
[544] Bernhard Schölkopf,et al. Discovering Causal Signals in Images , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[545] Lorenzo Rosasco,et al. Why and when can deep-but not shallow-networks avoid the curse of dimensionality: A review , 2016, International Journal of Automation and Computing.
[546] Tom Michael Mitchell,et al. Track how technology is transforming work , 2017, Nature.
[547] Yiwei Zhang,et al. Reinforcement Mechanism Design for Fraudulent Behaviour in e-Commerce , 2018, AAAI.
[548] Terrence J. Sejnowski,et al. Glider soaring via reinforcement learning in the field , 2018, Nature.
[549] Shuai Li,et al. TopRank: A practical algorithm for online stochastic ranking , 2018, NeurIPS.
[550] Richard Socher,et al. A Deep Reinforced Model for Abstractive Summarization , 2017, ICLR.
[551] Lawrence V. Snyder,et al. Reinforcement Learning for Solving the Vehicle Routing Problem , 2018, NeurIPS.
[552] Nenghai Yu,et al. Model-Level Dual Learning , 2018, ICML.
[553] Joshua B. Tenenbaum,et al. End-to-End Differentiable Physics for Learning and Control , 2018, NeurIPS.
[554] Kenneth O. Stanley,et al. Safe mutations for deep and recurrent neural networks through output gradients , 2017, GECCO.
[555] Tom Schaul,et al. Rainbow: Combining Improvements in Deep Reinforcement Learning , 2017, AAAI.
[556] Xin Wang,et al. No Metrics Are Perfect: Adversarial Reward Learning for Visual Storytelling , 2018, ACL.
[557] Marcin Andrychowicz,et al. Sim-to-Real Transfer of Robotic Control with Dynamics Randomization , 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[558] Tommi S. Jaakkola,et al. Towards Robust Interpretability with Self-Explaining Neural Networks , 2018, NeurIPS.
[559] Martha White,et al. Reinforcement Learning with Function-Valued Action Spaces for Partial Differential Equation Control , 2018, ICML.
[560] Jitendra Malik,et al. SFV: Reinforcement Learning of Physical Skills from Videos , 2018, ACM Trans. Graph.
[561] Kaiming He,et al. Exploring the Limits of Weakly Supervised Pretraining , 2018, ECCV.
[562] Utkarsh Upadhyay,et al. Deep Reinforcement Learning of Marked Temporal Point Processes , 2018, NeurIPS.
[563] Yang Cai,et al. Learning Safe Policies with Expert Guidance , 2018, NeurIPS.
[564] Eric Xing,et al. Deep Generative Models with Learnable Knowledge Constraints , 2018, NeurIPS.
[565] Song Han,et al. AMC: AutoML for Model Compression and Acceleration on Mobile Devices , 2018, ECCV.
[566] Razvan Pascanu,et al. Relational inductive biases, deep learning, and graph networks , 2018, ArXiv.
[567] Sergey Levine,et al. DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills , 2018, ACM Trans. Graph.
[568] Le Song,et al. SBEED: Convergent Reinforcement Learning with Nonlinear Function Approximation , 2017, ICML.
[569] Shimon Whiteson,et al. Learning with Opponent-Learning Awareness , 2017, AAMAS.
[570] David Budden,et al. Distributed Prioritized Experience Replay , 2018, ICLR.
[571] Jeffrey Dean,et al. Scalable and accurate deep learning with electronic health records , 2018, npj Digital Medicine.
[572] Tao Chen,et al. Hardware Conditioned Policies for Multi-Robot Transfer Learning , 2018, NeurIPS.
[573] Kenneth O. Stanley,et al. Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents , 2017, NeurIPS.
[574] Clare Lyle,et al. GAN Q-learning , 2018, ArXiv.
[575] Amir-massoud Farahmand,et al. Iterative Value-Aware Model Learning , 2018, NeurIPS.
[576] Marlos C. Machado,et al. Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents (Extended Abstract) , 2018, IJCAI.
[577] Arvind Satyanarayan,et al. The Building Blocks of Interpretability , 2018, Distill.
[578] Xiaoyan Zhu,et al. Emotional Chatting Machine: Emotional Conversation Generation with Internal and External Memory , 2017, AAAI.
[579] Sanja Fidler,et al. NerveNet: Learning Structured Policy with Graph Neural Networks , 2018, ICLR.
[580] Ji Feng,et al. AutoEncoder by Forest , 2017, AAAI.
[581] Craig Boutilier,et al. Data center cooling using model-predictive control , 2018, NeurIPS.
[582] Christopher D. Manning,et al. Compositional Attention Networks for Machine Reasoning , 2018, ICLR.
[583] David Silver,et al. Meta-Gradient Reinforcement Learning , 2018, NeurIPS.
[584] Samuel J Gershman,et al. The Successor Representation: Its Computational Logic and Neural Substrates , 2018, The Journal of Neuroscience.
[585] Vladlen Koltun,et al. Multi-Task Learning as Multi-Objective Optimization , 2018, NeurIPS.
[586] Shimon Whiteson,et al. Counterfactual Multi-Agent Policy Gradients , 2017, AAAI.
[587] Shuai Wang,et al. Deep learning for sentiment analysis: A survey , 2018, WIREs Data Mining Knowl. Discov..
[588] Joel Z. Leibo,et al. Inequity aversion improves cooperation in intertemporal social dilemmas , 2018, NeurIPS.
[589] Sergey Levine,et al. Meta-Learning and Universality: Deep Representations and Gradient Descent can Approximate any Learning Algorithm , 2017, ICLR.
[590] Martin Müller,et al. Memory-Augmented Monte Carlo Tree Search , 2018, AAAI.
[591] Adnan Darwiche,et al. Human-level intelligence or animal-like abilities? , 2017, Commun. ACM.
[592] Shie Mannor,et al. Learn What Not to Learn: Action Elimination with Deep Reinforcement Learning , 2018, NeurIPS.
[593] Shane Legg,et al. Reward learning from human preferences and demonstrations in Atari , 2018, NeurIPS.
[594] Karen Simonyan,et al. The challenge of realistic music generation: modelling raw audio at scale , 2018, NeurIPS.
[595] Philip Bachman,et al. Deep Reinforcement Learning that Matters , 2017, AAAI.
[596] Gang Pan,et al. Knowledge-Guided Agent-Tactic-Aware Learning for StarCraft Micromanagement , 2018, IJCAI.
[597] Ole Winther,et al. Recurrent Relational Networks , 2017, NeurIPS.
[598] Satinder Singh,et al. On Learning Intrinsic Rewards for Policy Gradient Methods , 2018, NeurIPS.
[599] Jie Zhang,et al. Inference Aided Reinforcement Learning for Incentive Mechanism Design in Crowdsourcing , 2018, NeurIPS.
[600] D. Sculley,et al. Winner's Curse? On Pace, Progress, and Empirical Rigor , 2018, ICLR.
[601] Douglas Eck,et al. A Neural Representation of Sketch Drawings , 2017, ICLR.
[602] Sergey Levine,et al. Probabilistic Model-Agnostic Meta-Learning , 2018, NeurIPS.
[603] Byron Boots,et al. Dual Policy Iteration , 2018, NeurIPS.
[604] Le Song,et al. Boosting the Actor with Dual Critic , 2017, ICLR.
[605] Pascal Poupart,et al. Unsupervised Video Object Segmentation for Deep Reinforcement Learning , 2018, NeurIPS.
[606] Pieter Abbeel,et al. Learning Plannable Representations with Causal InfoGAN , 2018, NeurIPS.
[607] Yao Liu,et al. Representation Balancing MDPs for Off-Policy Policy Evaluation , 2018, NeurIPS.
[608] David Duvenaud,et al. Neural Ordinary Differential Equations , 2018, NeurIPS.
[609] Thomas L. Griffiths,et al. Recasting Gradient-Based Meta-Learning as Hierarchical Bayes , 2018, ICLR.
[610] Honglak Lee,et al. Multitask Reinforcement Learning for Zero-shot Generalization with Subtask Dependencies , 2018, NeurIPS.
[611] Quanshi Zhang,et al. Visual interpretability for deep learning: a survey , 2018, Frontiers of Information Technology & Electronic Engineering.
[612] Shane Legg,et al. Noisy Networks for Exploration , 2017, ICLR.
[613] Richard Evans,et al. Learning Explanatory Rules from Noisy Data , 2017, J. Artif. Intell. Res..
[614] Amir Hussain,et al. Applications of Deep Learning and Reinforcement Learning to Biological Data , 2017, IEEE Transactions on Neural Networks and Learning Systems.
[615] Olexandr Isayev,et al. Deep reinforcement learning for de novo drug design , 2017, Science Advances.
[616] Gerald Tesauro,et al. Learning Abstract Options , 2018, NeurIPS.
[617] Nathan Kallus,et al. Confounding-Robust Policy Improvement , 2018, NeurIPS.
[618] Joel Z. Leibo,et al. Prefrontal cortex as a meta-reinforcement learning system , 2018, bioRxiv.
[619] Xinlei Chen,et al. Iterative Visual Reasoning Beyond Convolutions , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[620] Liang Zhang,et al. Deep reinforcement learning for page-wise recommendations , 2018, RecSys.
[621] Roger Wattenhofer,et al. Teaching a Machine to Read Maps with Deep Reinforcement Learning , 2017, AAAI.
[622] Craig Boutilier,et al. Non-delusional Q-learning and value-iteration , 2018, NeurIPS.
[623] Tie-Yan Liu,et al. Neural Architecture Optimization , 2018, NeurIPS.
[624] Hector Geffner,et al. Model-free, Model-based, and General Intelligence , 2018, IJCAI.
[625] Xiaohua Zhai,et al. The GAN Landscape: Losses, Architectures, Regularization, and Normalization , 2018, ArXiv.
[626] Gary Marcus,et al. Deep Learning: A Critical Appraisal , 2018, ArXiv.
[627] Razvan Pascanu,et al. Relational Deep Reinforcement Learning , 2018, ArXiv.
[628] Li Fei-Fei,et al. Progressive Neural Architecture Search , 2017, ECCV.
[629] Geoffrey E. Hinton,et al. Matrix capsules with EM routing , 2018, ICLR.
[630] Stuart J. Russell,et al. Meta-Learning MCMC Proposals , 2017, NeurIPS.
[631] Qingquan Song,et al. Efficient Neural Architecture Search with Network Morphism , 2018, ArXiv.
[632] David A. Wagner,et al. Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples , 2018, ICML.
[633] Ofir Nachum,et al. A Lyapunov-based Approach to Safe Reinforcement Learning , 2018, NeurIPS.
[634] Kai-Fu Lee. AI Superpowers: China, Silicon Valley, and the New World Order , 2018 .
[635] Joelle Pineau,et al. A Survey of Available Corpora for Building Data-Driven Dialogue Systems , 2015, Dialogue Discourse.
[636] Sergey Levine,et al. Meta-Reinforcement Learning of Structured Exploration Strategies , 2018, NeurIPS.
[637] Byron Boots,et al. Differentiable MPC for End-to-end Planning and Control , 2018, NeurIPS.
[638] Zheng Wang,et al. Machine Learning in Compiler Optimization , 2018, Proceedings of the IEEE.
[639] Chong Wang,et al. Subgoal Discovery for Hierarchical Dialogue Policy Learning , 2018, EMNLP.
[640] Lei Li,et al. Reinforced Co-Training , 2018, NAACL.
[641] Sergey Levine,et al. Using Simulation and Domain Adaptation to Improve Efficiency of Deep Robotic Grasping , 2018, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[642] Sergey Levine,et al. Reinforcement Learning and Control as Probabilistic Inference: Tutorial and Review , 2018, ArXiv.
[643] Nando de Freitas,et al. Playing hard exploration games by watching YouTube , 2018, NeurIPS.
[644] Joelle Pineau,et al. RE-EVALUATE: Reproducibility in Evaluating Reinforcement Learning Algorithms , 2018.
[645] John Miller,et al. When Recurrent Models Don't Need To Be Recurrent , 2018, ArXiv.
[646] Zhanxing Zhu,et al. Reinforced Continual Learning , 2018, NeurIPS.
[647] Xin Wang,et al. Video Captioning via Hierarchical Reinforcement Learning , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[648] Baochun Li,et al. Post: Device Placement with Cross-Entropy Minimization and Proximal Policy Optimization , 2018, NeurIPS.
[649] Kirthevasan Kandasamy,et al. Neural Architecture Search with Bayesian Optimisation and Optimal Transport , 2018, NeurIPS.
[650] George Papandreou,et al. Searching for Efficient Multi-Scale Architectures for Dense Image Prediction , 2018, NeurIPS.
[651] Yujing Hu,et al. Reinforcement Learning to Rank in E-Commerce Search Engine: Formalization, Analysis, and Application , 2018, KDD.
[652] Benjamin Van Roy,et al. Scalable Coordinated Exploration in Concurrent Reinforcement Learning , 2018, NeurIPS.
[653] Pieter Abbeel,et al. A Simple Neural Attentive Meta-Learner , 2017, ICLR.
[654] Shane Legg,et al. IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures , 2018, ICML.
[655] Soumik Sarkar,et al. Online Robust Policy Learning in the Presence of Unknown Adversaries , 2018, NeurIPS.
[656] Yin Zhou,et al. VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[657] Sergey Levine,et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor , 2018, ICML.
[658] Yoshua Bengio,et al. Bayesian Model-Agnostic Meta-Learning , 2018, NeurIPS.
[659] Oriol Vinyals,et al. Hierarchical Representations for Efficient Architecture Search , 2017, ICLR.
[660] H. Francis Song,et al. Machine Theory of Mind , 2018, ICML.
[661] Raia Hadsell,et al. Learning to Navigate in Cities Without a Map , 2018, NeurIPS.
[662] Rémi Munos,et al. Learning to Search with MCTSnets , 2018, ICML.
[663] Pieter Abbeel,et al. Continuous Adaptation via Meta-Learning in Nonstationary and Competitive Environments , 2017, ICLR.
[664] Jianfeng Gao,et al. BBQ-Networks: Efficient Exploration in Deep Reinforcement Learning for Task-Oriented Dialogue Systems , 2016, AAAI.
[665] Russ Tedrake,et al. Scalable End-to-End Autonomous Vehicle Testing via Rare-event Simulation , 2018, NeurIPS.
[666] Zhuoran Yang,et al. Multi-Agent Reinforcement Learning via Double Averaging Primal-Dual Optimization , 2018, NeurIPS.
[667] Fei Wang,et al. Deep learning for healthcare: review, opportunities and challenges , 2018, Briefings Bioinform..
[668] Martin Müller,et al. Move Prediction Using Deep Convolutional Neural Networks in Hex , 2018, IEEE Transactions on Games.
[669] Patrick M. Pilarski,et al. Accelerating Learning in Constructive Predictive Frameworks with the Successor Representation , 2018, 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[670] Cezary Kaliszyk,et al. Reinforcement Learning of Theorem Proving , 2018, NeurIPS.
[671] Vladlen Koltun,et al. An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling , 2018, ArXiv.
[672] Samuel J. Gershman,et al. Human-in-the-Loop Interpretability Prior , 2018, NeurIPS.
[673] Hyrum S. Anderson,et al. Learning to Evade Static PE Machine Learning Malware Models via Reinforcement Learning , 2018, ArXiv.
[674] Zachary C. Lipton,et al. The mythos of model interpretability , 2018, Commun. ACM.
[675] Richard Socher,et al. The Natural Language Decathlon: Multitask Learning as Question Answering , 2018, ArXiv.
[676] Pierre Baldi,et al. Solving the Rubik's Cube Without Human Knowledge , 2018, ArXiv.
[677] Doina Precup,et al. Learning with Options that Terminate Off-Policy , 2017, AAAI.
[678] Peter Stone,et al. Autonomous agents modelling other agents: A comprehensive survey and open problems , 2017, Artif. Intell..
[679] Chuang Gan,et al. Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding , 2018, NeurIPS.
[680] Kagan Tumer,et al. Evolutionary Reinforcement Learning , 2018, NeurIPS.
[681] Sergey Levine,et al. Sim2Real View Invariant Visual Servoing by Recurrent Control , 2017, ArXiv.
[682] Xue-Xin Wei,et al. Emergence of grid-like representations by training recurrent neural networks to perform spatial localization , 2018, ICLR.
[683] Yee Whye Teh,et al. An Analysis of Categorical Distributional Reinforcement Learning , 2018, AISTATS.
[684] Emma Brunskill,et al. Strategic Object Oriented Reinforcement Learning , 2018, ArXiv.
[685] Yuval Tassa,et al. DeepMind Control Suite , 2018, ArXiv.
[686] Chen Liang,et al. Memory Augmented Policy Optimization for Program Synthesis with Generalization , 2018, ArXiv.
[687] Eric P. Xing,et al. Gated Path Planning Networks , 2018, ICML.
[688] Aleksander Madry,et al. How Does Batch Normalization Help Optimization? (No, It Is Not About Internal Covariate Shift) , 2018, NeurIPS.
[689] Nicholas Jing Yuan,et al. XiaoIce Band: A Melody and Arrangement Generation Framework for Pop Music , 2018, KDD.
[690] Stefano Ermon,et al. Multi-Agent Generative Adversarial Imitation Learning , 2018, NeurIPS.
[691] Carl Doersch,et al. Learning Visual Question Answering by Bootstrapping Hard Attention , 2018, ECCV.
[692] Oriol Vinyals,et al. Representation Learning with Contrastive Predictive Coding , 2018, ArXiv.
[693] OpenAI. Learning Dexterous In-Hand Manipulation , 2018, ArXiv.
[694] William Yang Wang,et al. Deep Reinforcement Learning for NLP , 2018, ACL.
[695] Allan Jabri,et al. Universal Planning Networks , 2018, ICML.
[696] Neil Houlsby,et al. Transfer Learning with Neural AutoML , 2018, NeurIPS.
[697] Alexandre M. Bayen,et al. Expert Level Control of Ramp Metering Based on Multi-Task Deep Reinforcement Learning , 2017, IEEE Transactions on Intelligent Transportation Systems.
[698] J. Pearl,et al. The Book of Why: The New Science of Cause and Effect , 2018 .
[699] Trevor Darrell,et al. BDD100K: A Diverse Driving Video Database with Scalable Annotation Tooling , 2018, ArXiv.
[700] Shi Dong,et al. An Information-Theoretic Analysis of Thompson Sampling for Large Action Spaces , 2018, NeurIPS.
[701] William E. Byrd,et al. Neural Guided Constraint Logic Programming for Program Synthesis , 2018, NeurIPS.
[702] Shimon Whiteson,et al. TreeQN and ATreeC: Differentiable Tree-Structured Models for Deep Reinforcement Learning , 2017, ICLR.
[703] Kristen Grauman,et al. Learning to Look Around: Intelligently Exploring Unseen Environments for Unknown Tasks , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[704] Eric P. Xing,et al. On Unifying Deep Generative Models , 2017, ICLR.
[705] Eneko Agirre,et al. Unsupervised Neural Machine Translation , 2017, ICLR.
[706] F. Viégas,et al. Deep learning of aftershock patterns following large earthquakes , 2018, Nature.
[707] Jakub W. Pachocki,et al. Emergent Complexity via Multi-Agent Competition , 2017, ICLR.
[708] Guy Lever,et al. Value-Decomposition Networks For Cooperative Multi-Agent Learning Based On Team Reward , 2018, AAMAS.
[709] Joel Z. Leibo,et al. Unsupervised Predictive Memory in a Goal-Directed Agent , 2018, ArXiv.
[710] Pieter Abbeel,et al. Evolved Policy Gradients , 2018, NeurIPS.
[711] Lihong Li,et al. Adversarial Attacks on Stochastic Bandits , 2018, NeurIPS.
[712] Jürgen Schmidhuber,et al. Recurrent World Models Facilitate Policy Evolution , 2018, NeurIPS.
[713] Thierry Moreau,et al. Learning to Optimize Tensor Programs , 2018, NeurIPS.
[714] Le Song,et al. Learning Temporal Point Processes via Reinforcement Learning , 2018, NeurIPS.
[715] Qiang Yang,et al. An Overview of Multi-task Learning , 2018, National Science Review.
[716] Fangkai Yang,et al. PEORL: Integrating Symbolic Planning and Hierarchical Reinforcement Learning for Robust Decision-Making , 2018, IJCAI.
[717] Vijay Vasudevan,et al. Learning Transferable Architectures for Scalable Image Recognition , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[718] Jorge Nocedal,et al. Optimization Methods for Large-Scale Machine Learning , 2016, SIAM Rev..
[719] Matthew W. Hoffman,et al. Distributed Distributional Deterministic Policy Gradients , 2018, ICLR.
[720] Rémi Munos,et al. Implicit Quantile Networks for Distributional Reinforcement Learning , 2018, ICML.
[721] Yu Zhang,et al. Learning to Multitask , 2018, NeurIPS.
[722] Michael I. Jordan,et al. Generalized Zero-Shot Learning with Deep Calibration Network , 2018, NeurIPS.
[723] José M. F. Moura,et al. Adversarial Multiple Source Domain Adaptation , 2018, NeurIPS.
[724] Eric P. Xing,et al. Hybrid Retrieval-Generation Reinforced Agent for Medical Image Report Generation , 2018, NeurIPS.
[725] Richard S. Sutton,et al. Multi-step Reinforcement Learning: A Unifying Algorithm , 2017, AAAI.
[726] Koray Kavukcuoglu,et al. Neural scene representation and rendering , 2018, Science.
[727] Tom Schaul,et al. Deep Q-learning From Demonstrations , 2017, AAAI.
[728] Rico Sennrich,et al. Why Self-Attention? A Targeted Evaluation of Neural Machine Translation Architectures , 2018, EMNLP.
[729] Noam Brown,et al. Superhuman AI for heads-up no-limit poker: Libratus beats top professionals , 2018, Science.
[730] Liang Zhang,et al. Recommendations with Negative Feedback via Pairwise Deep Reinforcement Learning , 2018, KDD.
[731] Chris Dyer,et al. On the State of the Art of Evaluation in Neural Language Models , 2017, ICLR.
[732] Sergey Levine,et al. One-Shot Imitation from Observing Humans via Domain-Adaptive Meta-Learning , 2018, Robotics: Science and Systems.
[733] Luc De Raedt,et al. DeepProbLog: Neural Probabilistic Logic Programming , 2018, BNAIC/BENELEARN.
[734] Tim Kraska,et al. The Case for Learned Index Structures , 2018, SIGMOD.
[735] Peter W. Glynn,et al. Multi-agent Online Learning with Asynchronous Feedback Loss , 2018, NeurIPS.
[736] Albin Cassirer,et al. Randomized Prior Functions for Deep Reinforcement Learning , 2018, NeurIPS.
[737] Nicholas Jing Yuan,et al. DRN: A Deep Reinforcement Learning Framework for News Recommendation , 2018, WWW.
[738] Nan Jiang,et al. Hierarchical Imitation and Reinforcement Learning , 2018, ICML.
[739] Razvan Pascanu,et al. Relational recurrent neural networks , 2018, NeurIPS.
[740] Geoffrey J. Gordon,et al. Learning Beam Search Policies via Imitation Learning , 2018, NeurIPS.
[741] Shimon Whiteson,et al. QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning , 2018, ICML.
[742] Pan He,et al. Adversarial Examples: Attacks and Defenses for Deep Learning , 2017, IEEE Transactions on Neural Networks and Learning Systems.
[743] Frank Hutter,et al. Neural Architecture Search: A Survey , 2018, J. Mach. Learn. Res..
[744] Hamed Haddadi,et al. Deep Learning in Mobile and Wireless Networking: A Survey , 2018, IEEE Communications Surveys & Tutorials.
[745] Quoc V. Le,et al. Do Better ImageNet Models Transfer Better? , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[746] Gaëtan Hadjeres,et al. Deep Learning Techniques for Music Generation , 2019 .
[747] Lukasz Kaiser,et al. Universal Transformers , 2018, ICLR.
[748] Yang Yu,et al. Virtual-Taobao: Virtualizing Real-world Online Retail Environment for Reinforcement Learning , 2018, AAAI.
[749] Rahul Sukthankar,et al. Cognitive Mapping and Planning for Visual Navigation , 2017, International Journal of Computer Vision.
[750] Zachary C. Lipton,et al. Troubling Trends in Machine Learning Scholarship , 2018, ACM Queue.
[751] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[752] Alexei A. Efros,et al. Everybody Dance Now , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[753] Guy Lever,et al. Human-level performance in 3D multiplayer games with population-based reinforcement learning , 2018, Science.
[754] Yang Liu,et al. THUMT: An Open-Source Toolkit for Neural Machine Translation , 2017, AMTA.
[755] Julian Togelius,et al. Deep Learning for Video Game Playing , 2017, IEEE Transactions on Games.
[756] Ruocheng Guo,et al. A Survey of Learning Causality with Data , 2018, ACM Comput. Surv..
[757] Turgay Celik,et al. Toward a Smart Cloud: A Review of Fault-Tolerance Methods in Cloud Systems , 2018, IEEE Transactions on Services Computing.
[758] Ufuk Topcu,et al. Constrained Cross-Entropy Method for Safe Reinforcement Learning , 2020, IEEE Transactions on Automatic Control.
[759] Luc De Raedt,et al. Relational Reinforcement Learning , 2001, Machine Learning.