暂无分享,去创建一个
Mohit Sharma | Arjun Sharma | Nicholas Rhinehart | Kris M. Kitani | Nicholas Rhinehart | Mohit Sharma | Arjun Sharma
[1] Dean Pomerleau,et al. ALVINN, an autonomous land vehicle in a neural network , 2015 .
[2] Doina Precup,et al. Intra-Option Learning about Temporally Abstract Actions , 1998, ICML.
[3] Scott Niekum,et al. Clustering via Dirichlet Process Mixture Models for Portable Skill Discovery , 2011, Lifelong Learning.
[4] Shie Mannor,et al. Q-Cut - Dynamic Discovery of Sub-goals in Reinforcement Learning , 2002, ECML.
[5] Pieter Abbeel,et al. InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets , 2016, NIPS.
[6] Yoshua Bengio,et al. Generative Adversarial Nets , 2014, NIPS.
[7] Dan Klein,et al. Modular Multitask Reinforcement Learning with Policy Sketches , 2016, ICML.
[8] Max Welling,et al. Auto-Encoding Variational Bayes , 2013, ICLR.
[9] Gerhard Kramer,et al. Directed information for channels with feedback , 1998 .
[10] Andrew G. Barto,et al. Accelerating Reinforcement Learning through the Discovery of Useful Subgoals , 2001 .
[11] Stefano Ermon,et al. Generative Adversarial Imitation Learning , 2016, NIPS.
[12] Shimon Whiteson,et al. TACO: Learning Task Decomposition via Temporal Alignment for Control , 2018, ICML.
[13] Oliver Kroemer,et al. Learning to predict phases of manipulation tasks as hidden states , 2014, 2014 IEEE International Conference on Robotics and Automation (ICRA).
[14] Jan Peters,et al. Probabilistic inference for determining options in reinforcement learning , 2016, Machine Learning.
[15] Jürgen Schmidhuber,et al. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks , 2006, ICML.
[16] Gregory D. Hager,et al. Transition state clustering: Unsupervised surgical trajectory segmentation for robot learning , 2017, ISRR.
[17] Charles Elkan,et al. Expectation Maximization Algorithm , 2010, Encyclopedia of Machine Learning.
[18] Pravesh Ranchod,et al. Nonparametric Bayesian reward segmentation for skill discovery using inverse reinforcement learning , 2015, 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[19] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[20] Gaurav S. Sukhatme,et al. Multi-Modal Imitation Learning from Unstructured Demonstrations using Generative Adversarial Nets , 2017, NIPS.
[21] J. Massey. CAUSALITY, FEEDBACK AND DIRECTED INFORMATION , 1990 .
[22] Stefano Ermon,et al. InfoGAIL: Interpretable Imitation Learning from Visual Demonstrations , 2017, NIPS.
[23] Alicia P. Wolfe,et al. Identifying useful subgoals in reinforcement learning by local graph partitioning , 2005, ICML.
[24] Ben Poole,et al. Categorical Reparameterization with Gumbel-Softmax , 2016, ICLR.
[25] Ion Stoica,et al. Multi-Level Discovery of Deep Options , 2017, ArXiv.
[26] Ion Stoica,et al. DDCO: Discovery of Deep Continuous Options for Robot Learning from Demonstrations , 2017, CoRL.
[27] Alec Radford,et al. Proximal Policy Optimization Algorithms , 2017, ArXiv.