No Learner Left Behind: On the Complexity of Teaching Multiple Learners Simultaneously

We present a theoretical study of machine teaching in the setting where the teacher must use the same training set to teach multiple learners. This problem is a theoretical abstraction of the real-world classroom setting in which the teacher delivers the same lecture to academically diverse students. We define a minimax teaching criterion to guarantee the performance of the worst learner in the class. We prove that the teaching dimension increases with class diversity. For classes of conjugate Bayesian learners and of linear regression learners, we exhibit the corresponding minimax teaching sets. We then propose a method to enhance teaching by partitioning the class into sections. We present cases where the optimal partition minimizes the aggregate teaching dimension while maintaining the performance guarantee for all learners. Interestingly, we show that personalized education (one learner per section) is not necessarily the optimal partition. Our results generalize machine teaching to multiple learners and offer insight into how to teach large classes.
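As a rough illustration of the minimax idea, the following toy sketch (not the paper's construction) considers a class of conjugate Gaussian learners who estimate a mean with known variance. The assumptions are ours: every learner has the same prior strength but a different prior mean, the teacher wants every learner's posterior mean to land within `eps` of a target, and all function and parameter names are hypothetical. Under these assumptions, the smallest shared training set grows as the spread of prior means (the class diversity) grows.

```python
def posterior_mean(prior_mean, prior_strength, data_sum, n):
    """Posterior mean of a conjugate Gaussian learner (known variance).

    prior_strength acts as a pseudo-count for the prior mean.
    """
    return (prior_strength * prior_mean + data_sum) / (prior_strength + n)


def worst_case_error(prior_means, prior_strength, data_sum, n, target):
    """Largest distance to the target over all learners in the class."""
    return max(
        abs(posterior_mean(m, prior_strength, data_sum, n) - target)
        for m in prior_means
    )


def minimax_data_sum(prior_means, prior_strength, n, target):
    """Data sum that centers the two extreme learners around the target.

    Posterior means are increasing in the prior mean, so the worst learner
    is always one of the two extremes; balancing them minimizes the maximum.
    With no data (n == 0) the only possible sum is 0.
    """
    if n == 0:
        return 0.0
    midpoint = (min(prior_means) + max(prior_means)) / 2
    return target * (prior_strength + n) - prior_strength * midpoint


def smallest_teaching_set_size(prior_means, prior_strength, target, eps):
    """Smallest n such that one shared data set teaches every learner to eps.

    Assumes eps > 0 and prior_strength > 0, so the loop terminates.
    """
    n = 0
    while True:
        s = minimax_data_sum(prior_means, prior_strength, n, target)
        if worst_case_error(prior_means, prior_strength, s, n, target) <= eps:
            return n
        n += 1


if __name__ == "__main__":
    # Two classes with the same target-centered layout but different spread.
    narrow = smallest_teaching_set_size([0.0, 2.0], 1.0, target=1.0, eps=0.25)
    wide = smallest_teaching_set_size([0.0, 4.0], 1.0, target=2.0, eps=0.25)
    print(narrow, wide)
```

In this toy model the data sum `s` can be realized by `n` identical examples of value `s / n`, so the returned `n` is the size of an actual shared training set. The wider class needs more examples than the narrow one, which matches in spirit the abstract's claim that teaching difficulty grows with class diversity.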
