Optimal Quantum Sample Complexity of Learning Algorithms

$ \newcommand{\eps}{\varepsilon} $In learning theory, the VC dimension of a concept class $C$ is the most common way to measure its "richness." Writing $d$ for the VC dimension of $C$, in the PAC model $$ \Theta\Big(\frac{d}{\eps} + \frac{\log(1/\delta)}{\eps}\Big) $$ examples are necessary and sufficient for a learner to output, with probability at least $1-\delta$, a hypothesis $h$ that is $\eps$-close to the target concept $c$. In the related agnostic model, where the samples need not come from a $c\in C$, we know that $$ \Theta\Big(\frac{d}{\eps^2} + \frac{\log(1/\delta)}{\eps^2}\Big) $$ examples are necessary and sufficient to output a hypothesis $h\in C$ whose error is at most $\eps$ worse than that of the best concept in $C$. Here we analyze quantum sample complexity, where each example is a coherent quantum state. This model was introduced by Bshouty and Jackson, who showed that quantum examples are more powerful than classical examples in some fixed-distribution settings. However, Atici and Servedio, whose bound was later improved by Zhang, showed that in the PAC setting quantum examples cannot be much more powerful: the required number of quantum examples is $$ \Omega\Big(\frac{d^{1-\eta}}{\eps} + d + \frac{\log(1/\delta)}{\eps}\Big)\mbox{ for all }\eta> 0. $$ Our main result is that quantum and classical sample complexity are in fact equal up to constant factors in both the PAC and agnostic models. We give two approaches. The first is a fairly simple information-theoretic argument that yields the above two classical bounds and yields the same bounds for quantum sample complexity up to a $\log(d/\eps)$ factor. The second avoids the log-factor loss by analyzing the behavior of the "Pretty Good Measurement" on the quantum state identification problems that correspond to learning, which shows that classical and quantum sample complexity are equal up to constant factors.
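To make the state-identification view concrete, the following is a minimal numerical sketch, not the paper's analysis. It assumes the standard quantum example $\sum_x \sqrt{D(x)}\,|x, c(x)\rangle$ from the Bshouty and Jackson model, a uniform prior over a toy class consisting of all four Boolean functions on a two-point domain, and the identity that for pure states the Pretty Good Measurement succeeds with probability $\sum_i (\sqrt{G})_{ii}^2$, where $G$ is the Gram matrix of the prior-weighted states. The function names `quantum_example`, `copies`, and `pgm_success` are illustrative and do not come from the paper.

```python
import itertools
import numpy as np


def quantum_example(concept, dist):
    """Return sum_x sqrt(D(x)) |x, c(x)> as a real vector of length 2 * |X|."""
    psi = np.zeros(2 * len(dist))
    for x, (label, prob) in enumerate(zip(concept, dist)):
        psi[2 * x + label] = np.sqrt(prob)  # basis index encodes the pair (x, c(x))
    return psi


def copies(psi, t):
    """Return the t-fold tensor power of psi (t >= 1 copies of the example)."""
    out = psi
    for _ in range(t - 1):
        out = np.kron(out, psi)
    return out


def pgm_success(states, priors):
    """Average success probability of the Pretty Good Measurement on pure states."""
    states = np.asarray(states, dtype=float)
    priors = np.asarray(priors, dtype=float)
    weighted = states * np.sqrt(priors)[:, None]  # rows are sqrt(p_i) |psi_i>
    gram = weighted @ weighted.T                  # G_ij = sqrt(p_i p_j) <psi_i|psi_j>
    evals, evecs = np.linalg.eigh(gram)
    evals = np.clip(evals, 0.0, None)             # remove tiny negative round-off
    sqrt_gram = (evecs * np.sqrt(evals)) @ evecs.T
    return float(np.sum(np.diag(sqrt_gram) ** 2))


# Toy instance (an assumption for illustration): all Boolean functions on 2 points.
concepts = list(itertools.product([0, 1], repeat=2))
dist = [0.5, 0.5]
priors = [1 / len(concepts)] * len(concepts)

for t in (1, 2, 3):
    states = [copies(quantum_example(c, dist), t) for c in concepts]
    print(t, round(pgm_success(states, priors), 4))
```

On this toy instance the success probability of exactly identifying the concept rises from about 0.73 with one copy toward 1 as the number of copies grows. The Gram-matrix form is used here only because it avoids building the measurement operators explicitly.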

[1] William K. Wootters, et al. A ‘Pretty Good’ Measurement for Distinguishing Quantum States, 1994.

[2] Mario Baum. An Introduction To Computational Learning Theory, 2016.

[3] Ronald de Wolf, et al. Guest Column: A Survey of Quantum Learning Theory, 2017, SIGACT News.

[4] Ashley Montanaro. On the Distinguishability of Random Quantum States, 2007.

[5] Gilles Brassard, et al. Machine Learning in a Quantum World, 2006, Canadian AI.

[6] Leslie G. Valiant, et al. A theory of the learnable, 1984, STOC '84.

[7] Christino Tamon, et al. Quantum DNF Learnability Revisited, 2002, COCOON.

[8] Peter L. Bartlett, et al. Neural Network Learning - Theoretical Foundations, 1999.

[9] Shai Ben-David, et al. Understanding Machine Learning: From Theory to Algorithms, 2014.

[10] Claudio Gentile, et al. Sample Size Lower Bounds in PAC Learning by Algorithmic Complexity Theory, 1998, Theor. Comput. Sci.

[11] Ashish Kapoor, et al. Quantum deep learning, 2014, Quantum Inf. Comput.

[12] David Haussler, et al. Learnability and the Vapnik-Chervonenkis dimension, 1989, JACM.

[13] Rocco A. Servedio, et al. Equivalences and Separations Between Quantum and Classical Learnability, 2004, SIAM J. Comput.

[14] Rahul Jain, et al. New bounds on classical and quantum one-way communication complexity, 2008, Theor. Comput. Sci.

[15] Dmitry Gavinsky. Quantum predictive learning and communication complexity with single input, 2012, Quantum Inf. Comput.

[16] Ronald de Wolf, et al. A Survey of Quantum Learning Theory, 2017, ArXiv.

[17] Jean-Yves Audibert. Fast learning rates in statistical inference through aggregation, 2007, math/0703854.

[18] Gilles Brassard, et al. Quantum speed-up for unsupervised learning, 2012, Machine Learning.

[19] Yonina C. Eldar, et al. Designing optimal quantum detectors via semidefinite programming, 2003, IEEE Trans. Inf. Theory.

[20] Rocco A. Servedio, et al. Improved Bounds on Quantum Learning Algorithms, 2004, Quantum Inf. Process.

[21] Karsten A. Verbeurgt. Learning DNF under the uniform distribution in quasi-polynomial time, 1990, COLT '90.

[22] Umesh V. Vazirani, et al. An Introduction to Computational Learning Theory, 1994.

[23] Shun-ichi Amari, et al. A Theory of Pattern Recognition, 1968.

[24] Leslie G. Valiant, et al. A general lower bound on the number of examples needed for learning, 1988, COLT '88.

[25] R. Cleve, et al. Quantum fingerprinting, 2001, Physical review letters.

[26] Ashish Kapoor, et al. Quantum Perceptron Models, 2016, NIPS.

[27] Leslie G. Valiant, et al. Cryptographic Limitations on Learning Boolean Formulae and Finite Automata, 1993, Machine Learning: From Theory to Applications.

[28] Umesh V. Vazirani, et al. Quantum complexity theory, 1993, STOC.

[29] Nader H. Bshouty, et al. Learning DNF over the uniform distribution using a quantum example oracle, 1995, COLT '95.

[30] Claudio Gentile, et al. Improved lower bounds for learning from noisy examples: an information-theoretic approach, 1998, COLT '98.

[31] Steve Hanneke, et al. The Optimal Sample Complexity of PAC Learning, 2015, J. Mach. Learn. Res.

[32] Dave Bacon, et al. Optimal measurements for the dihedral hidden subgroup problem, 2005, Chic. J. Theor. Comput. Sci.

[33] Rocco A. Servedio, et al. Quantum Algorithms for Learning and Testing Juntas, 2007, Quantum Inf. Process.

[34] Schumacher, et al. Classical information capacity of a quantum channel, 1996, Physical review. A, Atomic, molecular, and optical physics.

[35] Ashley Montanaro, et al. The quantum query complexity of learning multilinear polynomials, 2011, Inf. Process. Lett.

[36] T. Sanders, et al. Analysis of Boolean Functions, 2012, ArXiv.

[37] Robin Kothari, et al. An optimal quantum algorithm for the oracle identification problem, 2013, STACS.

[38] Hans Ulrich Simon, et al. General bounds on the number of examples needed for learning probabilistic concepts, 1993, COLT '93.

[39] A. Harrow, et al. Quantum algorithm for linear systems of equations, 2008, Physical review letters.

[40] E. Knill, et al. Reversing quantum dynamics with near-optimal quantum and classical fidelity, 2000, quant-ph/0004088.

[41] Scott Aaronson, et al. Quantum Machine Learning Algorithms: Read the Fine Print, 2015.

[42] David Haussler, et al. Decision Theoretic Generalizations of the PAC Model for Neural Net and Other Learning Applications, 1992, Inf. Comput.

[43] Chi Zhang, et al. An improved lower bound on query complexity for quantum PAC learning, 2010, Inf. Process. Lett.

[44] R. Schapire, et al. Toward efficient agnostic learning, 1992, COLT '92.

[45] D. Angluin, et al. Learning From Noisy Examples, 1988, Machine Learning.

[46] Yonina C. Eldar, et al. On quantum detection and the square-root measurement, 2001, IEEE Trans. Inf. Theory.

[47] Andris Ambainis, et al. Quantum algorithms for search with wildcards and combinatorial group testing, 2012, Quantum Inf. Comput.

[48] Jihun Park, et al. The geometry of quantum learning, 2010, Quantum Inf. Process.

[49] Jeffrey C. Jackson, et al. An efficient membership-query algorithm for learning DNF with respect to the uniform distribution, 1994, Proceedings 35th Annual Symposium on Foundations of Computer Science.

[50] Amit Daniely, et al. Complexity Theoretic Limitations on Learning DNF's, 2014, COLT.

[51] Lov K. Grover. A fast quantum mechanical algorithm for database search, 1996, STOC '96.

[52] Vladimir Vapnik, Alexey Chervonenkis. On the uniform convergence of relative frequencies of events to their probabilities, 1971.

[53] M. Talagrand. Sharper Bounds for Gaussian and Empirical Processes, 1994.

[54] Raymond Laflamme, et al. An Introduction to Quantum Computing, 2007, Quantum Inf. Comput.

[55] Aryeh Kontorovich, et al. Exact Lower Bounds for the Agnostic Probably-Approximately-Correct (PAC) Machine Learning Model, 2016, The Annals of Statistics.

[56] Scott Aaronson, et al. The learnability of quantum states, 2006, Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences.

[57] Hans Ulrich Simon, et al. An Almost Optimal PAC Algorithm, 2015, COLT.

[58] Aram W. Harrow, et al. Quantum algorithm for solving linear systems of equations, 2010.