Assessing Human Interaction in Virtual Reality With Continually Learning Prediction Agents Based on Reinforcement Learning Algorithms: A Pilot Study

Artificial intelligence systems increasingly involve continual learning to enable flexibility in general situations that are not encountered during system training. Human interaction with autonomous systems is broadly studied, but research has hitherto under-explored interactions that occur while the system is actively learning and can noticeably change its behaviour within minutes. In this pilot study, we investigate how the interaction between a human and a continually learning prediction agent develops as the agent develops competency. Additionally, we compare two different agent architectures to assess how representational choices in agent design affect the human-agent interaction. We develop a virtual reality environment and a time-based prediction task wherein learned predictions from a reinforcement learning (RL) algorithm augment human predictions. We assess how a participant’s performance and behaviour in this task differ across agent types, using both quantitative and qualitative analyses. Our findings suggest that human trust in the system may be influenced by early interactions with the agent, and that trust in turn affects strategic behaviour, but limitations of the pilot study rule out any conclusive statement. We identify trust as a key feature of interaction to focus on when considering RL-based technologies, and make several recommendations for modification to this study in preparation for a larger-scale investigation. A video summary of this paper can be found at https://youtu.be/oVYJdnBqTwQ.
