Scalable photonic reinforcement learning by time-division multiplexing of laser chaos

Reinforcement learning involves decision-making in dynamic and uncertain environments and constitutes a crucial element of artificial intelligence. In our previous work, we experimentally demonstrated that the ultrafast chaotic oscillatory dynamics of lasers can be used to efficiently solve the two-armed bandit problem, which requires decision-making under a difficult trade-off known as the exploration–exploitation dilemma. However, that work considered only two selections; hence, the scalability of laser-chaos-based reinforcement learning remained to be clarified. In this study, we demonstrate a scalable, pipelined principle for resolving the multi-armed bandit problem by introducing time-division multiplexing of chaotically oscillating ultrafast time series. We present experimental demonstrations in which bandit problems with up to 64 arms were successfully solved, and in which laser chaos time series significantly outperform quasiperiodic signals, computer-generated pseudorandom numbers, and coloured noise. Detailed analyses are also provided, including performance comparisons among laser chaos signals generated under different physical conditions; the results of these comparisons coincide with the diffusivity inherent in the time series. This study paves the way for ultrafast reinforcement learning by taking advantage of the ultrahigh bandwidths of light waves and practical enabling technologies.
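To make the idea of time-division multiplexing for a multi-armed bandit more concrete, the following Python sketch shows one way such a scheme could be organised. It is an illustration under stated assumptions, not the experimental procedure of this study: the abstract does not specify the decision rule, so the binary-tree layout, the fixed-step threshold update, and all names (`logistic_map`, `decide_arm`, `update`, `run`, `step`, `limit`) are hypothetical. A logistic map stands in for the laser chaos signal. For 2^M arms, M consecutive samples (one per time slot) are each compared with a threshold, and each comparison fixes one bit of the selected arm's index.

```python
import random


def logistic_map(x0=0.3141, r=4.0):
    """Yield samples in [-0.5, 0.5]; a software stand-in for a laser chaos signal."""
    x = x0
    while True:
        x = r * x * (1.0 - x)
        yield x - 0.5


def decide_arm(samples, thresholds):
    """Pick one of 2**M arms from M consecutive samples, one per time slot.

    thresholds[k] holds 2**k values, one per node at depth k of a binary tree.
    Returns the arm index and the (depth, node, bit) path that led to it.
    """
    node = 0
    path = []
    for k, s in enumerate(samples):
        bit = 1 if s > thresholds[k][node] else 0
        path.append((k, node, bit))
        node = 2 * node + bit
    return node, path


def update(thresholds, path, rewarded, step=0.02, limit=0.45):
    """Shift each threshold on the path by a fixed step (an assumed, simplified rule)."""
    for k, node, bit in path:
        # Lowering a threshold makes bit 1 more likely; raising it favours bit 0.
        # After a reward, reinforce the bit just chosen; otherwise favour the other bit.
        direction = -1.0 if (bit == 1) == rewarded else 1.0
        t = thresholds[k][node] + direction * step
        thresholds[k][node] = max(-limit, min(limit, t))


def run(probabilities, plays=2000, seed=0):
    """Play a bandit whose arm i pays a reward with probability probabilities[i]."""
    m = (len(probabilities) - 1).bit_length()
    if 2 ** m != len(probabilities):
        raise ValueError("number of arms must be a power of two")
    thresholds = [[0.0] * (2 ** k) for k in range(m)]
    signal = logistic_map()
    rng = random.Random(seed)
    counts = [0] * len(probabilities)
    for _ in range(plays):
        samples = [next(signal) for _ in range(m)]  # M time slots per decision
        arm, path = decide_arm(samples, thresholds)
        rewarded = rng.random() < probabilities[arm]
        update(thresholds, path, rewarded)
        counts[arm] += 1
    return counts, thresholds


if __name__ == "__main__":
    counts, _ = run([0.2] * 7 + [0.9])
    print(counts)
```

In this sketch a single signal source serves every level of the tree by being read at successive time slots, so going from 2 arms to 2^M arms only lengthens the sample sequence per decision. The clamp at `limit` keeps thresholds inside the signal range so that some exploration remains possible.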
