Capacity-aware Sequential Recommendations
Mathijs de Weerdt | Matthijs T. J. Spaan | Frits de Nijs | Georgios Theocharous | Nikos Vlassis
[1] Liang Tang, et al. Automatic ad format selection via contextual bandits, 2013, CIKM.
[2] Guy Shani, et al. An MDP-Based Recommender System, 2002, J. Mach. Learn. Res.
[3] Edward J. Sondik, et al. The optimal control of partially observable Markov processes, 1971.
[4] W. R. Thompson. On the likelihood that one unknown probability exceeds another in view of the evidence of two samples, 1933.
[5] Mathijs de Weerdt, et al. Best-Response Planning of Thermostatically Controlled Loads under Power Constraints, 2015, AAAI.
[6] Benjamin Van Roy, et al. Near-optimal Reinforcement Learning in Factored MDPs, 2014, NIPS.
[7] E. Altman. Constrained Markov Decision Processes, 1999.
[8] John N. Tsitsiklis, et al. The Complexity of Markov Decision Processes, 1987, Math. Oper. Res.
[9] David A. Shamma, et al. YFCC100M, 2015, Commun. ACM.
[10] Matthijs T. J. Spaan, et al. Column Generation Algorithms for Constrained POMDPs, 2018, J. Artif. Intell. Res.
[11] Malcolm J. A. Strens, et al. A Bayesian Framework for Reinforcement Learning, 2000, ICML.
[12] Leslie Pack Kaelbling, et al. Planning and Acting in Partially Observable Stochastic Domains, 1998, Artif. Intell.
[13] Pascal Poupart, et al. Point-Based Value Iteration for Continuous POMDPs, 2006, J. Mach. Learn. Res.
[14] Peter L. Bartlett, et al. Fast-Tracking Stationary MOMDPs for Adaptive Management Problems, 2016, AAAI.
[15] Peter Dayan, et al. Scalable and Efficient Bayes-Adaptive Reinforcement Learning Based on Monte-Carlo Tree Search, 2013, J. Artif. Intell. Res.
[16] Kee-Eung Kim, et al. Approximate Linear Programming for Constrained Partially Observable Markov Decision Processes, 2015, AAAI.
[17] P. Randolph. Bayesian Decision Problems and Markov Chains, 1968.
[18] Mathijs de Weerdt, et al. Preallocation and Planning Under Stochastic Resource Constraints, 2018, AAAI.
[19] Shimon Whiteson, et al. A Survey of Multi-Objective Sequential Decision-Making, 2013, J. Artif. Intell. Res.
[20] Andrew G. Barto, et al. Optimal learning: computational procedures for Bayes-adaptive Markov decision processes, 2002.
[21] Bamshad Mobasher, et al. Context adaptation in interactive recommender systems, 2014, RecSys '14.
[22] Zheng Wen, et al. An Interactive Points of Interest Guidance System, 2017, IUI Companion.
[23] Shie Mannor, et al. Thompson Sampling for Learning Parameterized Markov Decision Processes, 2014, COLT.
[24] R. Bellman. A Markovian Decision Process, 1957.
[25] Hoong Chuin Lau, et al. An agent-based simulation approach to experience management in theme parks, 2013, Winter Simulations Conference (WSC).
[26] Dana Ron, et al. The power of amnesia: Learning probabilistic automata with variable memory length, 1996, Machine Learning.
[27] Ronald A. Howard, et al. Information Value Theory, 1966, IEEE Trans. Syst. Sci. Cybern.
[28] Harald Steck, et al. Evaluation of recommendations: rating-prediction and ranking, 2013, RecSys.
[29] David Hsu, et al. SARSOP: Efficient Point-Based POMDP Planning by Approximating Optimally Reachable Belief Spaces, 2008, Robotics: Science and Systems.
[30] Mathijs de Weerdt, et al. Bounding the Probability of Resource Constraint Violations in Multi-Agent MDPs, 2017, AAAI.
[31] Olivier Buffet, et al. MOMDPs: A Solution for Modelling Adaptive Management Problems, 2012, AAAI.
[32] E. Silver. Markovian Decision Processes with Uncertain Transition Probabilities or Rewards, 1963.
[33] Edmund H. Durfee, et al. Minimizing Maximum Regret in Commitment Constrained Sequential Decision Making, 2017, ICAPS.
[34] David Andre, et al. Model based Bayesian Exploration, 1999, UAI.
[35] David Hsu, et al. Monte Carlo Value Iteration for Continuous-State POMDPs, 2010, WAFR.
[36] David Hsu, et al. Planning under Uncertainty for Robotic Tasks with Mixed Observability, 2010, Int. J. Robotics Res.
[37] Craig Boutilier, et al. Decision-Theoretic Planning: Structural Assumptions and Computational Leverage, 1999, J. Artif. Intell. Res.
[38] Sham M. Kakade, et al. On the sample complexity of reinforcement learning, 2003.
[39] Alan R. Washburn, et al. The LP/POMDP marriage: Optimization with imperfect information, 2000.
[40] R. Gomory, et al. A Linear Programming Approach to the Cutting-Stock Problem, 1961.