Pierre Liotet | Francesco Vidaich | Alberto Maria Metelli | Marcello Restelli
[1] Joelle Pineau, et al. Combined Reinforcement Learning via Abstract Representations, 2018, AAAI.
[2] Bruce Lee Bowerman, et al. Nonstationary Markov decision processes and related topics in nonstationary Markov chains, 1974.
[3] Marcello Restelli, et al. Optimistic Policy Optimization via Multiple Importance Sampling, 2019, ICML.
[4] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[5] Tom Schaul, et al. Reinforcement Learning with Unsupervised Auxiliary Tasks, 2016, ICLR.
[6] Marcello Restelli, et al. Tree-based reinforcement learning for optimal water reservoir operation, 2010.
[7] Shie Mannor, et al. Contextual Markov Decision Processes, 2015, arXiv.
[8] Qiang Yang, et al. Lifelong Machine Learning Systems: Beyond Learning Algorithms, 2013, AAAI Spring Symposium: Lifelong Machine Learning.
[9] Lihong Li, et al. PAC-inspired Option Discovery in Lifelong Reinforcement Learning, 2014, ICML.
[10] Heiga Zen, et al. WaveNet: A Generative Model for Raw Audio, 2016, SSW.
[11] Sindhu Padakandla, et al. A Survey of Reinforcement Learning Algorithms for Dynamically Varying Environments, 2020, ACM Comput. Surv.
[12] Robert L. Smith, et al. A Linear Programming Approach to Nonstationary Infinite-Horizon Markov Decision Processes, 2013, Oper. Res.
[13] Falk Lieder, et al. Doing more with less: meta-reasoning and meta-learning in humans and machines, 2019, Current Opinion in Behavioral Sciences.
[14] Martin L. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[15] Leonidas J. Guibas, et al. Optimally combining sampling techniques for Monte Carlo rendering, 1995, SIGGRAPH.
[16] Pieter Abbeel, et al. Continuous Adaptation via Meta-Learning in Nonstationary and Competitive Environments, 2017, ICLR.
[17] Bruno Scherrer, et al. Non-Stationary Approximate Modified Policy Iteration, 2015, ICML.
[18] Joelle Pineau, et al. Decoupling Dynamics and Reward for Transfer Learning, 2018, ICLR.
[19] Doina Precup, et al. Towards Continual Reinforcement Learning: A Review and Perspectives, 2020, arXiv.
[20] Marcello Restelli, et al. Policy Optimization via Importance Sampling, 2018, NeurIPS.
[21] Robert L. Smith, et al. Solving nonstationary infinite horizon stochastic production planning problems, 2000, Oper. Res. Lett.
[22] Gerald Tesauro, et al. Learning to Learn without Forgetting By Maximizing Transfer and Minimizing Interference, 2018, ICLR.
[23] Marcello Restelli, et al. Foreign exchange trading: a risk-averse batch reinforcement learning approach, 2020, ICAIF.
[24] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[25] Jan Peters, et al. A Survey on Policy Search for Robotics, 2013, Found. Trends Robotics.
[26] Frank Sehnke, et al. Policy Gradients with Parameter-Based Exploration for Control, 2008, ICANN.
[27] Dit-Yan Yeung, et al. Hidden-Mode Markov Decision Processes for Nonstationary Sequential Decision Making, 2001, Sequence Learning.
[28] Ronald Ortner, et al. Variational Regret Bounds for Reinforcement Learning, 2019, UAI.
[29] M. de Rijke, et al. When People Change their Mind: Off-Policy Evaluation in Non-stationary Recommendation Environments, 2019, WSDM.
[30] Marcello Restelli, et al. Importance Weighted Transfer of Samples in Reinforcement Learning, 2018, ICML.
[31] Marcello Restelli, et al. Importance Sampling Techniques for Policy Optimization, 2020, J. Mach. Learn. Res.
[32] Natalia Gimelshein, et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library, 2019, NeurIPS.
[33] Paulo Martins Engel, et al. Dealing with non-stationary environments using context detection, 2006, ICML.
[34] Ronald J. Williams. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 2004, Machine Learning.
[35] Trevor Darrell, et al. Loss is its own Reward: Self-Supervision for Reinforcement Learning, 2016, ICLR.