Teaching with IMPACT

Like many problems in AI, supervised learning is computationally intractable in its general form. We hypothesize that an important reason humans can learn highly complex and varied concepts, in spite of this computational difficulty, is that they benefit tremendously from experienced and insightful teachers. This paper proposes a new learning framework that provides a role for a knowledgeable, benevolent teacher to guide the process of learning a target concept in a series of "curricular" phases or rounds. In each round, the teacher acts as a moderator, exposing the learner to a subset of the available training data to move it closer to mastering the target concept. Through both theoretical and empirical evidence, we argue that this framework enables simple, efficient learners to acquire very complex concepts from examples. In particular, we provide multiple examples of concept classes that are known to be unlearnable in the standard PAC setting, along with provably efficient algorithms for learning them in our extended setting. A key focus of our work is the ability to learn complex concepts on top of simpler, previously learned concepts, a direction with the potential to create more competent artificial agents.
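
To make the round structure concrete, here is a minimal sketch of the teacher-moderated loop in Python. The names (`Learner`, `teach`) and the easy-to-hard prefix curriculum are illustrative assumptions, not the paper's construction; the sketch only shows the control flow in which, each round, the teacher selects a subset of the training pool and the learner updates on that subset alone.

```python
# A minimal sketch of the curricular teaching loop described above.
# The subset-selection rule below is a placeholder assumption standing
# in for whatever curriculum a knowledgeable teacher would design.

from typing import Callable, List, Tuple

Example = Tuple[tuple, int]  # (features, label)

class Learner:
    """A deliberately simple learner: memorizes positive examples seen so far."""
    def __init__(self) -> None:
        self.positives: set = set()

    def fit(self, batch: List[Example]) -> None:
        # Update only on the examples the teacher chose to expose this round.
        for x, y in batch:
            if y == 1:
                self.positives.add(x)

    def predict(self, x: tuple) -> int:
        return 1 if x in self.positives else 0

def teach(pool: List[Example],
          curriculum: Callable[[List[Example], int], List[Example]],
          rounds: int) -> Learner:
    """Run the teacher-moderated loop: in each round the teacher (via
    `curriculum`) exposes the learner to a subset of the training pool."""
    learner = Learner()
    for t in range(rounds):
        batch = curriculum(pool, t)  # teacher picks this round's subset
        learner.fit(batch)           # learner updates on that subset only
    return learner

if __name__ == "__main__":
    # Hypothetical pool for the OR concept on two bits.
    pool = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
    # Example curriculum: reveal progressively larger prefixes of the pool,
    # assuming the teacher has pre-sorted it from "easy" to "hard".
    easy_to_hard = lambda data, t: data[: (t + 1) * 2]
    model = teach(pool, easy_to_hard, rounds=2)
    print([model.predict(x) for x, _ in pool])  # -> [0, 1, 1, 1]
```

Any subset-selection policy with the same signature slots into `teach` unchanged, which is the sense in which the framework separates the teacher's curriculum design from the learner's update rule.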
