A User Comfort Model and Index Policy for Personalizing Discrete Controller Decisions
[1] Martin Ester,et al. TrustWalker: a random walk model for combining trust-based and item-based recommendation , 2009, KDD.
[2] Volkan Cevher,et al. Time-Varying Gaussian Process Bandit Optimization , 2016, AISTATS.
[3] J. Gittins. Bandit processes and dynamic allocation indices , 1979 .
[4] P. Whittle. Restless bandits: activity allocation in a changing world , 1988, Journal of Applied Probability.
[5] R. Agrawal. Sample mean based index policies with O(log n) regret for the multi-armed bandit problem , 1995, Advances in Applied Probability.
[6] Joshua A. Taylor,et al. Index Policies for Demand Response , 2014, IEEE Transactions on Power Systems.
[7] Dan J. Kim,et al. A trust-based consumer decision-making model in electronic commerce: The role of trust, perceived risk, and their antecedents , 2019 .
[8] E. Feron,et al. Multi-UAV dynamic routing with partial observations using restless bandit allocation indices , 2008, 2008 American Control Conference.
[9] E. L. Lawler,et al. Branch-and-Bound Methods: A Survey , 1966, Oper. Res..
[10] Peter Auer,et al. Finite-time Analysis of the Multiarmed Bandit Problem , 2002, Machine Learning.
[11] Dimitris Bertsimas,et al. Restless Bandits, Linear Programming Relaxations, and a Primal-Dual Index Heuristic , 2000, Oper. Res..
[12] J. Niño-Mora. Restless Bandit Marginal Productivity Indices, Diminishing Returns, and Optimal Control of Make-to-Order/Make-to-Stock M/G/1 Queues , 2006 .
[13] Hiroshi Wakuya,et al. Bottom-up learning of hierarchical models in a class of deterministic POMDP environments , 2015, Int. J. Appl. Math. Comput. Sci..
[14] J. Niño-Mora. A Marginal Productivity Index Policy for the Finite-Horizon Multiarmed Bandit Problem , 2005, Proceedings of the 44th IEEE Conference on Decision and Control.
[15] J. Niño-Mora. RESTLESS BANDITS, PARTIAL CONSERVATION LAWS AND INDEXABILITY , 2001 .
[16] Andreas Krause,et al. Parallelizing Exploration-Exploitation Tradeoffs with Gaussian Process Bandit Optimization , 2012, ICML.
[17] R. Bellman,et al. Dynamic Programming and Markov Processes , 1960 .
[18] Andreas Krause,et al. Information-Theoretic Regret Bounds for Gaussian Process Optimization in the Bandit Setting , 2009, IEEE Transactions on Information Theory.
[19] José Niño-Mora,et al. Dynamic allocation indices for restless projects and queueing admission control: a polyhedral approach , 2002, Math. Program..
[20] José Niño-Mora,et al. Dynamic priority allocation via restless bandit marginal productivity indices , 2007, 2304.06115.
[21] Andreas Krause,et al. Contextual Gaussian Process Bandit Optimization , 2011, NIPS.
[22] T. L. Lai and Herbert Robbins. Asymptotically Efficient Adaptive Allocation Rules , 1985 .
[23] Dimitris Bertsimas,et al. Conservation Laws, Extended Polymatroids and Multiarmed Bandit Problems; A Polyhedral Approach to Indexable Systems , 1996, Math. Oper. Res..
[24] Andreas Krause,et al. Bayesian optimization for maximum power point tracking in photovoltaic power plants , 2016, 2016 European Control Conference (ECC).