Quantum Advantage Actor-Critic for Reinforcement Learning

Quantum computing offers efficient encapsulation of high-dimensional states. In this work, we propose a novel quantum reinforcement learning approach that combines the Advantage Actor-Critic algorithm with variational quantum circuits by replacing parts of the classical components. This approach addresses reinforcement learning's scalability concerns while maintaining high performance. We empirically test multiple quantum Advantage Actor-Critic configurations on the well-known Cart Pole environment to evaluate our approach in control tasks with continuous state spaces. Our results indicate that the hybrid strategy of using either a quantum actor or a quantum critic with classical post-processing yields a substantial performance increase compared to purely classical and purely quantum variants with similar parameter counts. The results further reveal the limits of current quantum approaches, which stem from the hardware constraints of noisy intermediate-scale quantum computers, and suggest further research into scaling hybrid approaches to larger and more complex control tasks.
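
As a rough illustration of the ingredients named above, the following minimal Python sketch shows the one-step advantage estimate and the resulting actor and critic losses, together with a toy "quantum critic with classical post-processing". It is not the circuit or training setup used in this work: the single-qubit RY circuit, the function names, and the parameters `w`, `b`, and `gamma` are all hypothetical and chosen only so the example can be simulated with the standard library.

```python
import math


def one_step_advantage(reward, value, next_value, gamma=0.99, done=False):
    """One-step advantage estimate A = r + gamma * V(s') - V(s).

    The bootstrap term is dropped when the episode has ended.
    """
    target = reward + (0.0 if done else gamma * next_value)
    return target - value


def a2c_losses(log_prob, advantage):
    """Per-step losses: actor uses -log pi(a|s) * A, critic uses A^2."""
    actor_loss = -log_prob * advantage
    critic_loss = advantage ** 2
    return actor_loss, critic_loss


def toy_quantum_critic(theta, w=1.0, b=0.0):
    """Toy stand-in for a variational circuit (assumption, not the paper's).

    Simulates RY(theta) applied to |0>, measures the expectation of Pauli-Z,
    then applies classical post-processing (a scale w and a bias b).
    """
    amp0 = math.cos(theta / 2.0)  # amplitude of |0> after RY(theta)
    amp1 = math.sin(theta / 2.0)  # amplitude of |1> after RY(theta)
    expectation_z = amp0 ** 2 - amp1 ** 2  # equals cos(theta)
    return w * expectation_z + b


# Illustrative values only; theta would normally encode the observed state.
value = toy_quantum_critic(theta=0.3, w=2.0, b=1.0)
next_value = toy_quantum_critic(theta=0.5, w=2.0, b=1.0)
advantage = one_step_advantage(1.0, value, next_value, gamma=0.99)
actor_loss, critic_loss = a2c_losses(math.log(0.5), advantage)
```

The scale and bias in the toy critic reflect why classical post-processing can matter: a Pauli-Z expectation is confined to the interval [-1, 1], whereas state values in a task such as Cart Pole generally are not.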
