Quantum machine learning with glow for episodic tasks and decision games

We consider a general class of models in which a reinforcement learning (RL) agent learns from cyclic interactions with an external environment via classical signals. Perceptual inputs are encoded as quantum states, which are then transformed by a quantum channel representing the agent's memory; the outcomes of measurements performed at the channel's output determine the agent's actions. Learning takes place via stepwise modifications of the channel properties. These modifications are described by an update rule that is inspired by the projective simulation (PS) model and equipped with a glow mechanism that allows for a backpropagation of policy changes, analogous to eligibility traces in RL and edge glow in PS. In this way, the model combines features of PS with the ability to generalize that its physical embodiment as a quantum system offers. We apply the agent to various setups of an invasion game and a grid world, which serve as elementary model tasks allowing a direct comparison with a basic classical PS agent.
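To make the role of glow concrete, the following is a minimal sketch of a basic classical two-layer PS agent, i.e. the kind of baseline the quantum agent is compared against, not the quantum channel model itself. It assumes the commonly used PS update h <- h - gamma (h - 1) + g * reward with a glow value g that is set to 1 on the edge just used and damped by a factor (1 - eta) at every step. The class name `BasicPSAgent`, the function `train_invasion_game`, the parameter names and the simplified two-symbol invasion game are illustrative assumptions.

```python
import random


class BasicPSAgent:
    """Two-layer classical PS agent with edge glow (illustrative sketch)."""

    def __init__(self, n_percepts, n_actions, gamma=0.0, eta=1.0, seed=None):
        # h-values weight the percept -> action edges; all start at 1.
        self.h = [[1.0] * n_actions for _ in range(n_percepts)]
        # Glow values mark recently used edges; all start at 0.
        self.g = [[0.0] * n_actions for _ in range(n_percepts)]
        self.gamma = gamma  # forgetting (damping of h towards 1)
        self.eta = eta      # glow damping; eta = 1 means no memory of past edges
        self.rng = random.Random(seed)

    def policy(self, percept):
        """Action probabilities p(a|s) = h(s, a) / sum_a' h(s, a')."""
        total = sum(self.h[percept])
        return [x / total for x in self.h[percept]]

    def act(self, percept):
        """Sample an action, damp all glow values, and mark the used edge."""
        actions = range(len(self.h[percept]))
        action = self.rng.choices(actions, weights=self.h[percept])[0]
        for row in self.g:
            for a in range(len(row)):
                row[a] *= 1.0 - self.eta
        self.g[percept][action] = 1.0
        return action

    def learn(self, reward):
        """Apply h <- h - gamma * (h - 1) + g * reward to every edge."""
        for s, row in enumerate(self.h):
            for a in range(len(row)):
                row[a] += -self.gamma * (row[a] - 1.0) + self.g[s][a] * reward


def train_invasion_game(agent, n_steps, seed=None):
    """Simplified invasion game: the rewarded action equals the shown symbol."""
    rng = random.Random(seed)
    total_reward = 0.0
    for _ in range(n_steps):
        percept = rng.randrange(2)  # attacker shows symbol 0 or 1
        action = agent.act(percept)
        reward = 1.0 if action == percept else 0.0
        agent.learn(reward)
        total_reward += reward
    return total_reward / n_steps  # average reward over the run
```

With `eta < 1`, a reward arriving after several steps also strengthens earlier edges, each in proportion to its remaining glow; this is what makes delayed rewards in tasks such as a grid world learnable. With `eta = 1`, only the most recently used edge is reinforced, which suffices for the invasion game, where every step is rewarded immediately.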
