A learning algorithm for the Whittle index policy for scheduling web crawlers
[1] Vivek S. Borkar, et al. Low Complexity Online Radio Access Technology Selection Algorithm in LTE-WiFi HetNet, 2020, IEEE Transactions on Mobile Computing.
[2] Qing Zhao, et al. Indexability of Restless Bandit Problems and Optimality of Whittle Index for Dynamic Multichannel Access, 2008, IEEE Transactions on Information Theory.
[3] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[4] John N. Tsitsiklis, et al. Simulation-based optimization of Markov reward processes, 1998, Proceedings of the 37th IEEE Conference on Decision and Control (Cat. No.98CH36171).
[5] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Vol. II, 1976.
[6] Vivek S. Borkar, et al. Index Policies for Real-Time Multicast Scheduling for Wireless Broadcast Systems, 2008, IEEE INFOCOM 2008 - The 27th Conference on Computer Communications.
[7] Vivek S. Borkar, et al. Whittle index policy for crawling ephemeral content, 2015, 2015 54th IEEE Conference on Decision and Control (CDC).
[8] Vivek S. Borkar, et al. Structural Properties of Optimal Transmission Policies Over a Randomly Varying Channel, 2008, IEEE Transactions on Automatic Control.
[9] José Niño-Mora, et al. Sensor scheduling for hunting elusive hiding targets via Whittle's restless bandit index policy, 2011, International Conference on NETwork Games, Control and Optimization (NetGCooP 2011).
[10] Laks V. S. Lakshmanan, et al. Learning influence probabilities in social networks, 2010, WSDM '10.
[11] Dafna Shahaf, et al. Tractable near-optimal policies for crawling, 2018, Proceedings of the National Academy of Sciences.
[12] Wei Chu, et al. Refining Recency Search Results with User Click Feedback, 2011, ArXiv.
[13] Vivek S. Borkar, et al. A Structure-aware Online Learning Algorithm for Markov Decision Processes, 2018, VALUETOOLS.
[14] Urtzi Ayesta, et al. Stochastic and fluid index policies for resource allocation problems, 2015, 2015 IEEE Conference on Computer Communications (INFOCOM).
[15] P. Whittle. Restless bandits: activity allocation in a changing world, 1988, Journal of Applied Probability.
[16] Vivek S. Borkar, et al. A reinforcement learning algorithm for restless bandits, 2018, 2018 Indian Control Conference (ICC).
[17] Dimitri P. Bertsekas, et al. Convergence Results for Some Temporal Difference Methods Based on Least Squares, 2009, IEEE Transactions on Automatic Control.
[18] José Niño-Mora, et al. A Dynamic Page-Refresh Index Policy for Web Crawlers, 2014, ASMTA.
[19] E. Feron, et al. Multi-UAV dynamic routing with partial observations using restless bandit allocation indices, 2008, 2008 American Control Conference.
[20] Liudmila Ostroumova, et al. Timely crawling of high-quality ephemeral new content, 2013, CIKM.
[21] Francesco De Pellegrini, et al. Optimal Trunk-Reservation by Policy Learning, 2019, IEEE INFOCOM 2019 - IEEE Conference on Computer Communications.