Bandwidth Enables Generalization in Quantum Kernel Models

Quantum computers are known to provide speedups over classical state-of-the-art machine learning methods in some specialized settings. For example, quantum kernel methods have been shown to provide an exponential speedup on a learning version of the discrete logarithm problem. Understanding the generalization of quantum models is essential to realizing similar speedups on problems of practical interest. Recent results demonstrate that generalization is hindered by the exponential size of the quantum feature space. Although these results suggest that quantum models cannot generalize when the number of qubits is large, in this paper we show that these results rely on overly restrictive assumptions. We consider a wider class of models by varying a hyperparameter that we call quantum kernel bandwidth. We analyze the large-qubit limit and provide explicit formulas for the generalization of a quantum model that can be solved in closed form. Specifically, we show that changing the value of the bandwidth can take a model from provably not being able to generalize to any target function to good generalization for well-aligned targets. Our analysis shows how the bandwidth controls the spectrum of the kernel integral operator and thereby the inductive bias of the model. We demonstrate empirically that our theory correctly predicts how varying the bandwidth affects generalization of quantum models on challenging datasets, including those far outside our theoretical assumptions. We discuss the implications of our results for quantum advantage in machine learning.
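The role of the bandwidth hyperparameter can be illustrated numerically. The sketch below is a minimal toy example, not the paper's exact construction: it assumes a simple product feature map in which each input coordinate is encoded by a single-qubit RY rotation scaled by a bandwidth parameter c, and the kernel is the state fidelity. With c = 1 and many qubits, off-diagonal kernel values concentrate near zero and the Gram matrix becomes nearly diagonal, the regime in which generalization fails; shrinking c restores structure in the kernel spectrum.

```python
# Illustrative sketch (assumed product feature map, not the paper's specific encoding):
# |psi(x)> = prod_j RY(c * x_j) |0>, kernel K(x, x') = |<psi(x)|psi(x')>|^2.
# Varying the bandwidth c changes how quickly the Gram matrix eigenvalues decay.
import numpy as np

def feature_state(x, c):
    """Return the 2^n-dimensional amplitude vector of prod_j RY(c * x_j)|0...0>."""
    state = np.array([1.0])
    for xj in x:
        theta = c * xj
        qubit = np.array([np.cos(theta / 2.0), np.sin(theta / 2.0)])  # RY(theta)|0>
        state = np.kron(state, qubit)
    return state

def quantum_kernel(X, c):
    """Gram matrix K[i, j] = |<psi(x_i)|psi(x_j)>|^2 (fidelity kernel)."""
    states = np.array([feature_state(x, c) for x in X])
    overlaps = states @ states.T  # real amplitudes, so the overlap is real
    return overlaps ** 2

rng = np.random.default_rng(0)
n_qubits, n_samples = 12, 200
X = rng.uniform(-np.pi, np.pi, size=(n_samples, n_qubits))

for c in [1.0, 0.3, 0.1]:
    K = quantum_kernel(X, c)
    eigvals = np.linalg.eigvalsh(K / n_samples)[::-1]   # descending spectrum
    off_diag = K[~np.eye(n_samples, dtype=bool)]
    print(f"c = {c:.1f}: mean off-diagonal K = {off_diag.mean():.3f}, "
          f"top-5 eigenvalues = {np.round(eigvals[:5], 3)}")
```

Running this, the c = 1.0 case shows a nearly flat, slowly decaying spectrum with vanishing off-diagonal entries, while smaller c concentrates the spectral mass in a few leading eigenvalues, the kind of inductive bias that permits generalization on well-aligned targets.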
