Learning teaching strategies in an Adaptive and Intelligent Educational System through Reinforcement Learning

One of the most important issues in Adaptive and Intelligent Educational Systems (AIES) is defining effective pedagogical policies for tutoring students according to their needs. This paper proposes using Reinforcement Learning (RL) in the pedagogical module of an educational system so that the system automatically learns the best pedagogical policy for teaching its students. One of the main characteristics of this approach is its ability to improve the pedagogical policy based only on experience acquired with other students who have similar learning characteristics. In this paper we study the learning performance of the educational system with respect to three issues: first, the convergence of learning towards accurate pedagogical policies; second, the role of exploration/exploitation strategies when applying RL to AIES; and finally, a method for reducing the training phase of the AIES.
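As a rough illustration of the general idea (not the system described in the paper), the sketch below shows tabular Q-learning with an epsilon-greedy exploration strategy for choosing tutoring actions from abstract student-knowledge states. The PedagogicalAgent class, the action names, and all parameter values are hypothetical placeholders introduced only for this example.

```python
import random
from collections import defaultdict

# Hypothetical tutoring actions; in an AIES these would come from the
# domain and pedagogical models.
ACTIONS = ["show_definition", "show_example", "pose_exercise"]

class PedagogicalAgent:
    """Minimal tabular Q-learning agent that selects tutoring actions."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.2):
        self.q = defaultdict(float)   # Q[(state, action)] -> estimated value
        self.alpha = alpha            # learning rate
        self.gamma = gamma            # discount factor
        self.epsilon = epsilon        # exploration rate (epsilon-greedy)

    def choose_action(self, state):
        # Exploration/exploitation trade-off: with probability epsilon try a
        # random tutoring action, otherwise exploit the best-valued one.
        if random.random() < self.epsilon:
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # Standard Q-learning update after one interaction with a student,
        # where the reward reflects the observed learning progress.
        best_next = max(self.q[(next_state, a)] for a in ACTIONS)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])
```

In this sketch the epsilon parameter governs the exploration/exploitation trade-off discussed above: larger values make the tutor try less familiar actions more often, while smaller values favour the actions currently estimated as best.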
