Learning From Instruction And Experience: Methods For Incorporating Procedural Domain Theories Into

This thesis defines and evaluates two systems that allow a teacher to provide instructions to a machine learner. My systems, FSKBANN and RATLE, expand the language that a teacher may use to provide advice to the learner. In particular, my techniques allow a teacher to give partially correct instructions about procedural tasks (tasks that are solved as sequences of steps). FSKBANN and RATLE allow a computer to learn both from instruction and from experience. Experiments with these systems on several testbeds demonstrate that they produce learners that successfully use and refine the instructions they are given.
In my initial approach, FSKBANN, the teacher provides instructions as a set of propositional rules organized around one or more finite-state automata (FSAs). FSKBANN maps the knowledge in the rules and FSAs into a recurrent neural network. I used FSKBANN to refine the Chou-Fasman algorithm, a method for solving the secondary-structure prediction problem, a difficult task in molecular biology. FSKBANN produces a refined algorithm that outperforms the original (non-learning) Chou-Fasman algorithm, as well as a standard neural-network approach.
My second system, RATLE, allows a teacher to communicate advice, using statements in a simple programming language, to a connectionist, reinforcement-learning agent. The teacher indicates conditions of the environment and actions the agent should take under those conditions. RATLE allows the teacher to give advice continuously by translating the teacher's statements into additions to the agent's neural network. The RATLE language also includes features that are novel to the theory-refinement literature, such as multi-step plans and looping constructs. In experiments with RATLE on two simulated testbeds involving multiple agents, I demonstrate that a RATLE agent receiving advice outperforms both an agent that does not receive advice and an agent that receives instruction but does not refine it.
My methods provide an appealing approach for learning from both instruction and experience in procedural tasks. This work widens the "information pipeline" between humans and machine learners, without requiring that the human provide absolutely correct information to the learner.
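
The abstract says that propositional rules are mapped into a neural network but does not give the encoding. As an illustration only, the sketch below shows the rule-to-unit mapping commonly described for KBANN-style systems: each positive antecedent gets a link of weight ω, each negated antecedent a link of weight -ω, and the bias is set so the unit fires only when the whole conjunction holds. The value ω = 4, the function names, and the example rule with its feature names are assumptions made for this sketch; they are not taken from the thesis, and the recurrent links that would carry FSA state are not modelled.

```python
import math

# Assumed link weight; 4.0 is a value often quoted for KBANN-style encodings,
# not a figure taken from this abstract.
OMEGA = 4.0


def rule_to_unit(antecedents, omega=OMEGA):
    """Translate one conjunctive rule into weights and a bias for a sigmoid unit.

    antecedents maps a feature name to True (positive literal) or
    False (negated literal).
    """
    weights = {
        name: (omega if positive else -omega)
        for name, positive in antecedents.items()
    }
    n_positive = sum(1 for positive in antecedents.values() if positive)
    # The net input exceeds zero only if every positive antecedent is on
    # and every negated antecedent is off.
    bias = -(n_positive - 0.5) * omega
    return weights, bias


def activate(weights, bias, inputs):
    """Sigmoid activation of the unit for a dict of 0/1 input values."""
    net = bias + sum(w * inputs.get(name, 0.0) for name, w in weights.items())
    return 1.0 / (1.0 + math.exp(-net))


# Hypothetical rule, for illustration only:
#   helix :- hydrophobic, not proline_nearby.
weights, bias = rule_to_unit({"hydrophobic": True, "proline_nearby": False})
fires = activate(weights, bias, {"hydrophobic": 1.0, "proline_nearby": 0.0})
blocked = activate(weights, bias, {"hydrophobic": 1.0, "proline_nearby": 1.0})
```

Here `fires` is above 0.5 and `blocked` is below it, so the unit initially behaves like the rule. Because the rule is stored as ordinary weights, subsequent training can adjust it, which is the sense in which partially correct instructions can be refined from experience.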

Richard Maclin