A new method for concurrently visualizing states, values, and actions in reinforcement-based brain-machine interfaces

This paper presents the first attempt to quantify the individual contributions of the subject and of the computer agent to the performance of a closed-loop Reinforcement Learning Brain-Machine Interface (RLBMI). The distinctive feature of the RLBMI architecture is the co-adaptation of two systems: a BMI decoder acting as the agent and a BMI user acting as the environment. In this work, an agent implemented with Q-learning via kernel temporal difference, KTD(λ), decodes the neural states of a monkey and maps them to action directions of a robotic arm. We analyze how each participant influences overall performance, in both successful and missed trials, by visualizing the states, the corresponding action values Q, and the resulting actions in two-dimensional space. With the proposed methodology, we can observe how the decoder learns an effective state-to-action mapping, and how the neural states affect prediction performance.
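To make the decoding scheme concrete, the following is a minimal sketch of Q-learning with a kernel expansion in the spirit of KTD: each observed neural state becomes a kernel center, Q(x, a) is a weighted sum of kernel evaluations against those centers, and the temporal-difference error sets the new center's weight for the taken action. This is an illustrative TD(0) simplification, not the paper's implementation; all names and parameter values (`eta`, `gamma`, `sigma`) are assumptions chosen for the example.

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian kernel between two state vectors."""
    return np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma ** 2))

class KTDQ:
    """Illustrative kernel-expansion Q-learner (TD(0) sketch, not the
    paper's KTD(lambda) implementation).

    Q(x, a) = sum_i alpha[i][a] * k(x, centers[i]); each observed state
    is stored as a kernel center, weighted by its TD error.
    """

    def __init__(self, n_actions, eta=0.5, gamma=0.9, sigma=1.0):
        self.n_actions = n_actions
        self.eta = eta          # learning rate (assumed value)
        self.gamma = gamma      # discount factor (assumed value)
        self.sigma = sigma      # kernel width (assumed value)
        self.centers = []       # stored neural states (kernel centers)
        self.alpha = []         # per-center weight vector over actions

    def q_values(self, x):
        """Q(x, .) evaluated as a kernel expansion over stored centers."""
        q = np.zeros(self.n_actions)
        for c, w in zip(self.centers, self.alpha):
            q += w * gaussian_kernel(x, c, self.sigma)
        return q

    def update(self, x, action, reward, x_next, terminal=False):
        """One TD step: store x as a new center weighted by the TD error."""
        if terminal:
            target = reward
        else:
            target = reward + self.gamma * np.max(self.q_values(x_next))
        delta = target - self.q_values(x)[action]
        w = np.zeros(self.n_actions)
        w[action] = self.eta * delta
        self.centers.append(np.asarray(x, dtype=float))
        self.alpha.append(w)
        return delta
```

A decoder like this maps each trial's neural state to the action with the largest Q value; the visualization proposed in the paper would then project the stored states into two dimensions and overlay the learned Q values and chosen actions.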
