Metareasoning in Modular Software Systems: On-the-Fly Configuration using Reinforcement Learning with Rich Contextual Representations
Eric Horvitz | Sean Andrist | Debadeepta Dey | Adith Swaminathan | Alekh Agarwal | Besmira Nushi | Aditya Modi
[1] Yi Liu, et al. An Efficient Bandit Algorithm for Realtime Multivariate Optimization, 2017, KDD.
[2] John N. Tsitsiklis, et al. Actor-Critic Algorithms, 1999, NIPS.
[3] John Langford, et al. Residual Loss Prediction: Reinforcement Learning With No Incremental Feedback, 2018, ICLR.
[4] James Andrew Bagnell, et al. Learning in modular systems, 2010.
[5] Tommi S. Jaakkola, et al. Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms, 2000, Machine Learning.
[6] Ben J. A. Kröse, et al. Learning from delayed rewards, 1995, Robotics Auton. Syst.
[7] Anil A. Bharath, et al. Deep Reinforcement Learning: A Brief Survey, 2017, IEEE Signal Processing Magazine.
[8] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[9] Eric Horvitz, et al. Perception, Attention, and Resources: A Decision-Theoretic Approach to Graphics Rendering, 1997, UAI.
[10] Srikanth Kandula, et al. Resource Management with Deep Reinforcement Learning, 2016, HotNets.
[11] Yuxi Li, et al. Deep Reinforcement Learning: An Overview, 2017, ArXiv.
[12] John Langford, et al. A Contextual Bandit Bake-off, 2018, J. Mach. Learn. Res.
[13] Luca Antiga, et al. Automatic differentiation in PyTorch, 2017.
[14] Sergey Levine, et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, 2018, ICML.
[15] Samy Bengio, et al. Device Placement Optimization with Reinforcement Learning, 2017, ICML.
[16] Jim Gao, et al. Machine Learning Applications for Data Center Optimization, 2014.
[17] Deborah Hanus, et al. Smart scheduling: optimizing Tilera's process scheduling via reinforcement learning, 2013.
[18] Andrew W. Moore, et al. Reinforcement Learning: A Survey, 1996, J. Artif. Intell. Res.
[19] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[20] Thorsten Joachims, et al. Beyond myopic inference in big data pipelines, 2013, KDD.
[21] Christina Delimitrou, et al. Paragon: QoS-aware scheduling for heterogeneous datacenters, 2013, ASPLOS '13.
[22] Pietro Perona, et al. Microsoft COCO: Common Objects in Context, 2014, ECCV.
[23] Mehmet Demirci, et al. A Survey of Machine Learning Applications for Energy-Efficient Resource Management in Cloud Computing Environments, 2015, 2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA).
[24] José Antonio Lozano, et al. A Review of Auto-scaling Techniques for Elastic Applications in Cloud Environments, 2014, Journal of Grid Computing.
[25] Eric Horvitz, et al. Principles and applications of continual computation, 2001, Artif. Intell.
[26] Ivona Brandic, et al. Using meta-heuristics and machine learning for software optimization of parallel computing systems: a systematic literature review, 2018, Computing.
[27] Alexandra Fedorova, et al. Operating System Scheduling On Heterogeneous Core Systems, 2007.
[28] Cheng-Zhong Xu, et al. URL: A unified reinforcement learning approach for autonomic cloud management, 2012, J. Parallel Distributed Comput.
[29] Christina Delimitrou, et al. Quasar: resource-efficient and QoS-aware cluster management, 2014, ASPLOS.
[30] Ronald J. Williams, et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 2004, Machine Learning.