Yuandong Tian | Hongzi Mao | Shaun Singh | Drew Dimmery | Eytan Bakshy | Mohammad Alizadeh | Shannon Chen | Drew Blaisdell
[1] Radford M. Neal. Pattern Recognition and Machine Learning , 2007, Technometrics.
[2] Ali C. Begen,et al. An experimental evaluation of rate-adaptation algorithms in adaptive streaming over HTTP , 2011, MMSys.
[3] Hongzi Mao,et al. Variance Reduction for Reinforcement Learning in Input-Driven Environments , 2018, ICLR.
[4] Alex Graves,et al. Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.
[5] Iraj Sodagar,et al. The MPEG-DASH Standard for Multimedia Streaming Over the Internet , 2011, IEEE MultiMedia.
[6] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[7] Filip De Turck,et al. A learning-based algorithm for improved bandwidth-awareness of adaptive streaming clients , 2015, 2015 IFIP/IEEE International Symposium on Integrated Network Management (IM).
[8] R. J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 2004, Machine Learning.
[9] Ufuk Topcu,et al. Safe Reinforcement Learning via Shielding , 2017, AAAI.
[10] Eytan Bakshy,et al. Bayesian Optimization for Policy Search via Online-Offline Experimentation , 2019, J. Mach. Learn. Res.
[11] Nando de Freitas,et al. Taking the Human Out of the Loop: A Review of Bayesian Optimization , 2016, Proceedings of the IEEE.
[13] Ramesh K. Sitaraman,et al. Video Stream Quality Impacts Viewer Behavior: Inferring Causality Using Quasi-Experimental Designs , 2012, IEEE/ACM Transactions on Networking.
[14] Pieter Abbeel,et al. Towards Characterizing Divergence in Deep Q-Learning , 2019, ArXiv.
[15] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[16] Christian Timmerer,et al. Dynamic adaptive streaming over HTTP dataset , 2012, MMSys '12.
[17] George Zyskind,et al. On Best Linear Estimation and General Gauss-Markov Theorem in Linear Models with Arbitrary Nonnegative Covariance Structure , 1969.
[18] Peter I. Frazier,et al. A Tutorial on Bayesian Optimization , 2018, ArXiv.
[19] Herke van Hoof,et al. Addressing Function Approximation Error in Actor-Critic Methods , 2018, ICML.
[20] Te-Yuan Huang,et al. A buffer-based approach to rate adaptation: evidence from a large video streaming service , 2015, SIGCOMM 2015.
[21] Yuandong Tian,et al. ELF: An Extensive, Lightweight and Flexible Research Platform for Real-time Strategy Games , 2017, NIPS.
[22] Zhi-Li Zhang,et al. Vivisecting YouTube: An active measurement study , 2012, 2012 Proceedings IEEE INFOCOM.
[23] Ramesh K. Sitaraman,et al. BOLA: Near-Optimal Bitrate Adaptation for Online Videos , 2016, IEEE/ACM Transactions on Networking.
[24] Hongzi Mao,et al. Neural Adaptive Video Streaming with Pensieve , 2017, SIGCOMM.
[25] Bruno Sinopoli,et al. A Control-Theoretic Approach for Dynamic Adaptive Video Streaming over HTTP , 2015, Comput. Commun. Rev..
[26] Michael Fairbank,et al. The divergence of reinforcement learning algorithms with value-iteration and function approximation , 2011, The 2012 International Joint Conference on Neural Networks (IJCNN).
[27] Lex Weaver,et al. The Optimal Reward Baseline for Gradient-Based Reinforcement Learning , 2001, UAI.
[28] Craig Boutilier,et al. Non-delusional Q-learning and value-iteration , 2018, NeurIPS.
[29] Filip De Turck,et al. Design of a Q-learning-based client quality selection algorithm for HTTP adaptive video streaming , 2013, ALA 2013.
[30] Vyas Sekar,et al. Understanding the impact of video quality on user engagement , 2011, SIGCOMM.
[31] Andrew Y. Ng,et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping , 1999, ICML.
[32] Carl E. Rasmussen,et al. Gaussian processes for machine learning , 2005, Adaptive computation and machine learning.
[33] Peter L. Bartlett,et al. Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning , 2001, J. Mach. Learn. Res.
[34] Guilherme Ottoni,et al. Constrained Bayesian Optimization with Noisy Experiments , 2017, Bayesian Analysis.