Jian Qian
Publications
Exploration Bonus for Regret Minimization in Undiscounted Discrete and Continuous Markov Decision Processes
Alessandro Lazaric,
Matteo Pirotta,
Ronan Fruit,
2018,
ArXiv.
Martha White,
Daniel Graves,
Matthew Schlegel,
2019,
NeurIPS.
Avrim Blum,
Steve Hanneke,
Han Shao,
2021,
COLT.