Jianzhun Du | Joseph Futoma | Finale Doshi-Velez
[1] Yao Liu et al. Interpretable Off-Policy Evaluation in Reinforcement Learning by Highlighting Influential Transitions, 2020, ICML.
[2] Sridhar Mahadevan et al. Recent Advances in Hierarchical Reinforcement Learning, 2003, Discrete Event Dynamic Systems.
[3] Richard S. Sutton et al. Reinforcement Learning: An Introduction, 1998, IEEE Transactions on Neural Networks.
[4] Daan Wierstra et al. Recurrent Environment Simulators, 2017, ICLR.
[5] Kenji Doya et al. Reinforcement Learning in Continuous Time and Space, 2000, Neural Computation.
[6] Adam M. Oberman et al. How to Train Your Neural ODE: the World of Jacobian and Kinetic Regularization, 2020, ICML.
[7] Shane Legg et al. Human-level control through deep reinforcement learning, 2015, Nature.
[8] Edward De Brouwer et al. GRU-ODE-Bayes: Continuous modeling of sporadically-observed time series, 2019, NeurIPS.
[9] Honglak Lee et al. Action-Conditional Video Prediction using Deep Networks in Atari Games, 2015, NIPS.
[10] Yan Liu et al. Recurrent Neural Networks for Multivariate Time Series with Missing Values, 2016, Scientific Reports.
[11] Richard S. Sutton et al. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding, 1996.
[12] Sergey Levine et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, 2018, ICML.
[13] Ronald E. Parr et al. Hierarchical control and learning for Markov decision processes, 1998.
[14] Daan Wierstra et al. Stochastic Backpropagation and Approximate Inference in Deep Generative Models, 2014, ICML.
[15] Max Welling et al. Auto-Encoding Variational Bayes, 2013, ICLR.
[16] Yuval Tassa et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[17] L. C. Baird et al. Reinforcement learning in continuous time: advantage updating, 1994, Proceedings of the 1994 IEEE International Conference on Neural Networks (ICNN '94).
[18] Doina Precup et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artificial Intelligence.
[19] Tom Schaul et al. Prioritized Experience Replay, 2015, ICLR.
[20] Jürgen Schmidhuber et al. Learning Precise Timing with LSTM Recurrent Networks, 2003, Journal of Machine Learning Research.
[21] Thorsten Joachims et al. MOReL: Model-Based Offline Reinforcement Learning, 2020, NeurIPS.
[22] Balaraman Ravindran et al. Learning to Repeat: Fine Grained Action Repetition for Deep Reinforcement Learning, 2017, ICLR.
[23] David Duvenaud et al. Latent ODEs for Irregularly-Sampled Time Series, 2019, arXiv.
[24] Pieter Abbeel et al. Model-Augmented Actor-Critic: Backpropagating through Paths, 2020, ICLR.
[25] Finale Doshi-Velez et al. Robust and Efficient Transfer Learning with Hidden Parameter Markov Decision Processes, 2017, AAAI.
[26] Michael O. Duff et al. Reinforcement Learning Methods for Continuous-Time Markov Decision Problems, 1994, NIPS.
[27] Patrick Kidger et al. Neural Controlled Differential Equations for Irregular Time Series, 2020, NeurIPS.
[28] Sham M. Kakade et al. Plan Online, Learn Offline: Efficient Learning and Exploration via Model-Based Control, 2018, ICLR.
[29] Sergey Levine et al. When to Trust Your Model: Model-Based Policy Optimization, 2019, NeurIPS.
[30] David Q. Mayne et al. Constrained model predictive control: Stability and optimality, 2000, Automatica.
[31] Pieter Abbeel et al. Benchmarking Model-Based Reinforcement Learning, 2019, arXiv.
[32] Yann Ollivier et al. Making Deep Q-learning methods robust to time discretization, 2019, ICML.
[33] Samy Bengio et al. Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks, 2015, NIPS.
[34] Lantao Yu et al. MOPO: Model-based Offline Policy Optimization, 2020, NeurIPS.
[35] Jürgen Schmidhuber et al. On Learning to Think: Algorithmic Information Theory for Novel Combinations of Reinforcement Learning Controllers and Recurrent Neural World Models, 2015, arXiv.
[36] Yoshua Bengio et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation, 2014, EMNLP.
[37] Ari Weinstein et al. Model-based hierarchical reinforcement learning and human action control, 2014, Philosophical Transactions of the Royal Society B: Biological Sciences.
[38] Andrew W. Moore et al. Reinforcement Learning: A Survey, 1996, Journal of Artificial Intelligence Research.
[39] Rémi Munos et al. Policy Gradient in Continuous Time, 2006, Journal of Machine Learning Research.
[40] Yuval Tassa et al. MuJoCo: A physics engine for model-based control, 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[41] Finale Doshi-Velez et al. Combining Kernel and Model Based Learning for HIV Therapy Selection, 2017, CRI.
[42] Quoc V. Le et al. Sequence to Sequence Learning with Neural Networks, 2014, NIPS.
[43] Stefan Bauer et al. Adaptive Skip Intervals: Temporal Abstraction for Recurrent Dynamical Models, 2018, NeurIPS.
[44] Herke van Hoof et al. Addressing Function Approximation Error in Actor-Critic Methods, 2018, ICML.
[45] Shimon Whiteson et al. Deep Variational Reinforcement Learning for POMDPs, 2018, ICML.
[46] Tze-Yun Leong et al. An Efficient Approach to Model-Based Hierarchical Reinforcement Learning, 2017, AAAI.
[47] B. Adams et al. Dynamic multidrug therapies for HIV: optimal and STI control approaches, 2004, Mathematical Biosciences and Engineering (MBE).
[48] David Duvenaud et al. Learning Differential Equations that are Easy to Solve, 2020, NeurIPS.
[49] Louis Wehenkel et al. Clinical data based optimal STI strategies for HIV: a reinforcement learning approach, 2006, Proceedings of the 45th IEEE Conference on Decision and Control.
[50] Jan Peters et al. Model-based Lookahead Reinforcement Learning, 2019, arXiv.
[51] Sergey Levine et al. Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates, 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[52] Pieter Abbeel et al. Model-Ensemble Trust-Region Policy Optimization, 2018, ICLR.
[53] David Duvenaud et al. Neural Ordinary Differential Equations, 2018, NeurIPS.
[54] Jimmy Ba et al. Exploring Model-based Planning with Policy Networks, 2019, ICLR.