The metacognitive loop I: Enhancing reinforcement learning with metacognitive monitoring and control for improved perturbation tolerance
[1] Donald Perlis, et al. On the consistency of commonsense reasoning, 1986, Comput. Intell.
[2] Madhura Nirkhe. Time-Situated Reasoning within Tight Deadlines and Realistic Space and Computation Bounds, 1994, AAAI.
[3] Seiichi Ozawa, et al. Incremental learning in dynamic environments using neural network with long-term memory, 2003, Proceedings of the International Joint Conference on Neural Networks, 2003.
[4] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996, Encyclopedia of Machine Learning.
[5] Lorenzo Peña y Gonzalo. Paraconsistent logic: essays on the inconsistent, 1990.
[6] H. Kendler, et al. Vertical and horizontal processes in problem solving, 1962, Psychological Review.
[7] Andrew Y. Ng, et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping, 1999, ICML.
[8] András Lörincz, et al. MDPs: Learning in Varying Environments, 2003, J. Mach. Learn. Res.
[9] Howard H. Kendler, et al. Reversal-shift behavior: Some basic issues, 1969.
[10] David R. Traum, et al. Representations of Dialogue State for Domain and Task Independent Meta-Dialogue, 1999, Electron. Trans. Artif. Intell.
[11] Donald Perlis, et al. Systems that detect and repair their own mistakes, 2001.
[12] Sarit Kraus, et al. How to (Plan to) Meet a Deadline between Now and Then, 1997, J. Log. Comput.
[13] Thomas G. Dietterich. What is machine learning?, 2020, Archives of Disease in Childhood.
[14] T. O. Nelson. Consciousness and metacognition, 1996.
[15] Ben J. A. Kröse, et al. Learning from delayed rewards, 1995, Robotics Auton. Syst.
[16] Donald Perlis, et al. Logic, Self-awareness and Self-improvement: the Metacognitive Loop and the Problem of Brittleness, 2005, J. Log. Comput.
[17] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[18] Chung Hee Hwang, et al. The TRAINS project: a case study in building a conversational planning agent, 1994, J. Exp. Theor. Artif. Intell.
[19] Graham Priest, et al. Paraconsistent Logic: Essays on the Inconsistent, 1990.
[20] John Dunlosky, et al. Utilization of Metacognitive Judgments in the Allocation of Study During Multitrial Learning, 1994.
[21] Donald Perlis, et al. Active Logics: A Unified Formal Approach to Episodic Reasoning, 1999.
[22] Donald Perlis, et al. Presentations and this and that: logic in action, 1998.
[23] Donald Perlis, et al. Conversational adequacy: mistakes are the essence, 1998, Int. J. Hum. Comput. Stud.
[24] Eyal Amir, et al. Toward a Formalization of Elaboration Tolerance: Adding and Deleting Axioms, 2001.
[25] Marco Wiering, et al. Reinforcement Learning in Dynamic Environments using Instantiated Information, 2001, ICML.
[26] Anthony Hunter, et al. Paraconsistent logics, 1998.
[27] Donald Perlis, et al. RGL Study in a Hybrid Real-time System, 2003, Neural Networks and Computational Intelligence.
[28] Andrew W. Moore, et al. Prioritized sweeping: Reinforcement learning with less data and less time, 2004, Machine Learning.
[29] P. N. Johnson-Laird, et al. Talking to computers, 1976, Nature.
[30] Donald Perlis, et al. Reasoning situated in time I: basic concepts, 1990, J. Exp. Theor. Artif. Intell.
[31] Donald Perlis, et al. Step-logic: reasoning situated in time, 1988.
[32] J. McCarthy. Elaboration Tolerance, 1997.
[33] Peter Dayan, et al. Q-learning, 1992, Machine Learning.
[34] J. Dunlosky, et al. Norms of paired-associate recall during multitrial learning of Swahili-English translation equivalents, 1994, Memory.
[35] Donald Perlis, et al. Towards domain-independent, task-oriented, conversational adequacy, 2003, IJCAI.