Universal quantum control through deep reinforcement learning

Emerging reinforcement learning techniques using deep neural networks have shown great promise in control optimization. They harness non-local regularities of noisy control trajectories and facilitate transfer learning between tasks. To leverage these capabilities for quantum control optimization, we propose a new control framework that simultaneously optimizes the speed and fidelity of quantum computation against both leakage and stochastic control errors. For a broad family of two-qubit unitary gates that are important for quantum simulation of many-electron systems, we improve control robustness by adding control noise to the training environments of reinforcement learning agents trained with trust-region policy optimization. The agents' control solutions demonstrate a two-order-of-magnitude reduction in average gate error over baseline stochastic gradient descent solutions and up to a one-order-of-magnitude reduction in gate time compared with optimal gate synthesis counterparts. These improvements in both fidelity and runtime are achieved by combining new physical insights with state-of-the-art machine learning techniques. Our results open an avenue for wider applications in quantum simulation, quantum chemistry and quantum supremacy tests using near-term quantum devices.
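To make the idea of a noisy training environment concrete, the following is a minimal sketch rather than the framework described above. It assumes a single-qubit toy Hamiltonian with piecewise-constant controls (the abstract concerns two-qubit gates with leakage), Gaussian amplitude noise of hypothetical strength `noise_std`, and the standard closed-form average gate fidelity between two unitaries, F = (|Tr(U†V)|² + d) / (d(d + 1)). The function names are illustrative only, and no reinforcement learning agent is included.

```python
import numpy as np

# Pauli matrices for a single-qubit toy model (an assumption for illustration).
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)


def step_unitary(h, dt):
    """Return exp(-i * h * dt) for a Hermitian matrix h via eigendecomposition."""
    evals, evecs = np.linalg.eigh(h)
    return evecs @ np.diag(np.exp(-1j * evals * dt)) @ evecs.conj().T


def propagate(controls, dt, noise_std=0.0, rng=None):
    """Evolve under piecewise-constant controls (ax, az) per time step.

    Independent Gaussian noise with standard deviation noise_std is added
    to each control amplitude, mimicking a noisy training environment.
    """
    if rng is None:
        rng = np.random.default_rng()
    u = np.eye(2, dtype=complex)
    for ax, az in controls:
        ax_noisy = ax + noise_std * rng.normal()
        az_noisy = az + noise_std * rng.normal()
        h = 0.5 * (ax_noisy * SX + az_noisy * SZ)
        u = step_unitary(h, dt) @ u
    return u


def average_gate_fidelity(u_target, u_actual):
    """Average gate fidelity between two unitaries of dimension d."""
    d = u_target.shape[0]
    overlap = np.trace(u_target.conj().T @ u_actual)
    return (abs(overlap) ** 2 + d) / (d * (d + 1))


if __name__ == "__main__":
    # Ten equal steps that implement an X gate (up to a global phase) without noise.
    controls = [(np.pi / 10, 0.0)] * 10
    rng = np.random.default_rng(0)
    ideal = average_gate_fidelity(SX, propagate(controls, 1.0))
    noisy = average_gate_fidelity(SX, propagate(controls, 1.0, noise_std=0.1, rng=rng))
    print("noiseless fidelity:", ideal)
    print("noisy fidelity:", noisy)
```

In a training loop, an agent's reward could be built from the fidelity of such noisy rollouts, so that control sequences that only work in the noiseless case are penalized; how the reward is actually defined in the paper is not stated in the abstract.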
