Learning Optimal Policies with State Aggregation
Andrea Bonarini | Alessandro Lazaric | Marcello Restelli
[1] A. Lazaric, et al. Learning Optimal Policies using Bound Estimation, 2005.
[2] Robert Givan, et al. Bounded-parameter Markov decision processes, 2000, Artif. Intell.