Photonic architecture for reinforcement learning

The last decade has seen unprecedented growth in artificial intelligence and photonic technologies, both of which push the limits of modern computing devices. In line with these developments, this work brings together the state of the art of both fields within the framework of reinforcement learning. We present a blueprint for a photonic implementation of an active learning machine incorporating contemporary algorithms such as SARSA, Q-learning, and projective simulation. We numerically investigate its performance in typical reinforcement learning environments, showing that realistic levels of experimental noise can be tolerated or can even benefit the learning process. Remarkably, the architecture itself enables mechanisms of abstraction and generalization, two features often considered key ingredients of artificial intelligence. The proposed architecture, based on single-photon evolution on a mesh of tunable beamsplitters, is simple and scalable, and a first integration in portable systems appears to be within reach of near-term technology.
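As a rough illustration of what "single-photon evolution on a mesh of tunable beamsplitters" could look like in simulation, the following minimal Python sketch propagates one photon through a sequence of two-mode beamsplitters and samples the detected output mode. It is not the architecture proposed in this work. The beamsplitter parameterization (`theta`, `phi`), the reading of the input mode as a percept and the detected mode as an action, and all function names are assumptions made for this example only.

```python
import cmath
import math
import random


def apply_beamsplitter(amps, m, theta, phi):
    """Mix modes m and m+1 of a single-photon amplitude vector in place.

    Assumed 2x2 unitary: [[e^{i phi} cos(theta), -sin(theta)],
                          [e^{i phi} sin(theta),  cos(theta)]].
    """
    a, b = amps[m], amps[m + 1]
    c, s = math.cos(theta), math.sin(theta)
    phase = cmath.exp(1j * phi)
    amps[m] = phase * c * a - s * b
    amps[m + 1] = phase * s * a + c * b


def output_probabilities(n_modes, input_mode, layers):
    """Inject one photon into input_mode and return detection probabilities.

    layers is a list of (mode, theta, phi) tuples applied in order
    (a hypothetical description of the mesh settings).
    """
    amps = [0j] * n_modes
    amps[input_mode] = 1 + 0j
    for m, theta, phi in layers:
        apply_beamsplitter(amps, m, theta, phi)
    return [abs(x) ** 2 for x in amps]


def sample_mode(probs, rng):
    """Sample the output mode in which the photon is detected."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    # Three modes, two beamsplitters; the settings are arbitrary illustrative values.
    mesh = [(0, math.pi / 4, 0.0), (1, math.pi / 3, 0.5)]
    probs = output_probabilities(3, 0, mesh)
    rng = random.Random(0)
    print(probs, sample_mode(probs, rng))
```

Because each beamsplitter acts unitarily, the output probabilities sum to one, so a single detection event can be read directly as a sample from a discrete distribution. In a learning loop, the tunable parameters would be the quantities updated after each reward, under whatever update rule the chosen algorithm prescribes.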
