Efficiency of quantum vs. classical annealing in nonconvex learning problems

Significance

Quantum annealers are physical quantum devices designed to solve optimization problems: they search for low-energy configurations of an appropriate energy function, exploiting cooperative tunneling effects to escape local minima. Classical annealers use thermal fluctuations for the same computational purpose, and Markov chains based on this principle are among the most widespread optimization techniques. The fundamental mechanism underlying quantum annealing is a controllable quantum perturbation that generates tunneling processes. The computational potential of quantum annealers is still under debate, since only a few ad hoc positive results are known. Here, we identify a wide class of large-scale nonconvex optimization problems for which quantum annealing is efficient while classical annealing gets stuck. These problems are of central interest to machine learning.

Quantum annealers aim at solving nonconvex optimization problems by exploiting cooperative tunneling effects to escape local minima. The underlying idea is to design a classical energy function whose ground states are the sought optimal solutions of the original optimization problem, and to add a controllable quantum transverse field that generates tunneling processes. A key challenge is to identify classes of nonconvex optimization problems for which quantum annealing remains efficient while thermal annealing fails. We show that this happens for a wide class of problems that are central to machine learning. Their energy landscapes are dominated by local minima that cause an exponential slowdown of classical thermal annealers, while simulated quantum annealing converges efficiently to rare dense regions of optimal solutions.
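The thermal side of this comparison can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure used in the paper: the toy energy (the number of misclassified patterns of a perceptron with ±1 weights), the linear inverse-temperature schedule, and every name below (`energy`, `thermal_anneal`, `beta0`, `beta1`) are hypothetical choices introduced here.

```python
import math
import random


def energy(weights, patterns, labels):
    """Toy energy (assumed): number of patterns the +/-1 perceptron misclassifies."""
    errors = 0
    for x, y in zip(patterns, labels):
        field = sum(w * xi for w, xi in zip(weights, x))
        if field * y <= 0:
            errors += 1
    return errors


def thermal_anneal(patterns, labels, sweeps=200, beta0=0.1, beta1=5.0, seed=0):
    """Single-spin-flip Metropolis annealing with a linear schedule (assumed)."""
    rng = random.Random(seed)
    n = len(patterns[0])
    w = [rng.choice((-1, 1)) for _ in range(n)]
    e = energy(w, patterns, labels)
    best_e = e
    for s in range(sweeps):
        # Inverse temperature grows from beta0 to beta1 over the run.
        beta = beta0 + (beta1 - beta0) * s / max(1, sweeps - 1)
        for _ in range(n):
            j = rng.randrange(n)
            w[j] = -w[j]  # propose flipping one weight
            e_new = energy(w, patterns, labels)
            delta = e_new - e
            # Metropolis rule: always accept downhill, uphill with prob exp(-beta*delta).
            if delta <= 0 or rng.random() < math.exp(-beta * delta):
                e = e_new
                best_e = min(best_e, e)
            else:
                w[j] = -w[j]  # reject: undo the flip
    return w, e, best_e
```

At large `beta` the uphill acceptance probability becomes small, so a chain of this kind can remain in a local minimum for a long time; that is the qualitative behavior the abstract attributes to thermal annealers. The sketch does not reproduce the quantum side of the comparison.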
