Anticipatory Classifier System with Average Reward Criterion in Discretized Multi-Step Environments
[1] M. Puterman. Chapter 8 Markov decision processes, 1990.
[2] Sridhar Mahadevan, et al. Average reward reinforcement learning: Foundations, algorithms, and empirical results, 2004, Machine Learning.
[3] Dan Xia, et al. Learning classifier system with average reward reinforcement learning, 2013, Knowl. Based Syst.
[4] Wojciech Zaremba, et al. OpenAI Gym, 2016, ArXiv.
[5] Stewart W. Wilson. Classifier Fitness Based on Accuracy, 1995, Evolutionary Computation.
[6] Satinder P. Singh, et al. Reinforcement Learning Algorithms for Average-Payoff Markovian Decision Processes, 1994, AAAI.
[7] Joachim Hoffmann, et al. Lernmechanismen zum Erwerb verhaltenssteuernden Wissens, 2000.
[8] Sridhar Mahadevan, et al. Sensitive Discount Optimality: Unifying Discounted and Average Reward Reinforcement Learning, 1996, ICML.
[9] Martin V. Butz, et al. Biasing Exploration in an Anticipatory Learning Classifier System, 2001, IWLCS.
[10] Stewart W. Wilson. ZCS: A Zeroth Level Classifier System, 1994, Evolutionary Computation.
[11] John H. Holland, et al. Escaping brittleness: the possibilities of general-purpose learning algorithms applied to parallel rule-based systems, 1995.
[12] Wolfgang Stolzmann, et al. An Introduction to Anticipatory Classifier Systems, 1999, Learning Classifier Systems.
[13] Jason H. Moore, et al. ExSTraCS 2.0: description and evaluation of a scalable learning classifier system, 2015, Evolutionary Intelligence.
[14] Keivan Borna, et al. Customer satisfaction prediction with Michigan-style learning classifier system, 2019.
[15] Olgierd Unold, et al. Introducing Action Planning to the Anticipatory Classifier System ACS2, 2019, CORES.
[16] Pierre Collet, et al. BACS: A Thorough Study of Using Behavioral Sequences in ACS2, 2020, PPSN.
[17] Andrew W. Moore, et al. Reinforcement Learning: A Survey, 1996, J. Artif. Intell. Res.