Combinatorial Bandits for Incentivizing Agents with Dynamic Preferences