Optimal Teaching for Online Perceptrons

Consider a teacher designing a good lecture for students, or a hacker crafting poisoned text inputs against the chatbot Tay. Both cases can be formulated as the task of constructing special training data such that a known learning algorithm, when given the constructed data, arrives at a prespecified target model. This task is known as optimal teaching, and it has a wide range of applications in education, psychology, computer security, program synthesis, and beyond. Prior analyses of optimal teaching focused exclusively on batch learners. However, a theoretical understanding of optimal teaching for online (sequential) learners is also important for many applications. This paper presents the first study of optimal teaching for an online learner, specifically the perceptron. We show how to construct the shortest teaching sequence for a perceptron, and prove that the sequence has length one when the teacher knows everything about the perceptron, and length three when the teacher does not know the perceptron's initial weight vector.
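To make the "length one" case concrete, the sketch below shows one way a teacher who knows the learner's current weight vector can force a single mistake-driven update onto the target decision boundary. This is an illustrative construction under stated assumptions (classic perceptron update, target specified up to positive scaling), not necessarily the paper's exact proof construction; all function names are hypothetical.

```python
# A minimal sketch, assuming the classic perceptron update
# (w <- w + y*x whenever y * (w . x) <= 0) and a target model given as a
# homogeneous linear separator, so any positive scaling of w_star is acceptable.
import numpy as np

def perceptron_update(w, x, y):
    """One online perceptron step: update only on a mistake (margin <= 0)."""
    if y * np.dot(w, x) <= 0:
        return w + y * x
    return w

def one_shot_teaching_example(w0, w_star):
    """Construct a single (x, y) that moves w0 onto the target direction w_star.

    Choose y = +1 and x = c * w_star - w0 with c > 0 small enough that
    w0 . x = c * (w0 . w_star) - ||w0||^2 <= 0, so the update fires and the
    new weight vector equals c * w_star (same decision boundary as w_star).
    """
    dot = float(np.dot(w0, w_star))
    if dot > 0:
        c = float(np.dot(w0, w0)) / dot  # largest c keeping the margin non-positive
    else:
        c = 1.0                          # any positive scale works
    x = c * w_star - w0
    return x, +1

if __name__ == "__main__":
    w0 = np.array([2.0, -1.0])       # learner's (known) initial weights
    w_star = np.array([1.0, 3.0])    # teacher's target weight vector
    x, y = one_shot_teaching_example(w0, w_star)
    w1 = perceptron_update(w0, x, y)
    # w1 is a positive multiple of w_star, i.e. the target decision boundary.
    print(w1, w1 / w_star)
```

Note that this construction hinges on knowing w0 in order to guarantee the mistake condition fires; when the initial weight vector is unknown, the teacher cannot target the update this way, which is why the abstract reports a longer (length-three) teaching sequence in that setting.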
