Experiments with sensorimotor games in dynamic human/machine interaction

While interacting with a machine, humans naturally formulate beliefs about the machine's behavior, and these beliefs affect the interaction. Since humans and machines have imperfect information about each other and their environment, a natural model for their interaction is a game. Such games have been investigated from the perspective of economic game theory, and some results on discrete decision-making have been translated to the neuromechanical setting, but there is little work on the continuous sensorimotor games that arise when humans interact in a dynamic closed loop with machines. We study these games both theoretically and experimentally, deriving predictive models for steady-state (i.e., equilibrium) and transient (i.e., learning) behaviors of humans interacting with other agents (humans and machines). Specifically, we consider experiments wherein agents are instructed to control a linear system so as to minimize a given quadratic cost functional, i.e., the agents play a Linear-Quadratic game. Using our recent results on gradient-based learning in continuous games, we derive predictions regarding steady-state and transient play. These predictions are compared with empirical observations of human sensorimotor learning using a teleoperation testbed.
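To make the gradient-based learning model concrete, the following is a minimal sketch of simultaneous gradient play in a two-player quadratic game. It is an illustration only, not the paper's experimental setup: the scalar costs `f1`, `f2`, the coupling signs, the step size, and the initial joint action are all assumed for the example.

```python
# Two-player quadratic game solved by simultaneous gradient play.
# Player 1 chooses x to minimize f1(x, y) = x^2 + x*y
# Player 2 chooses y to minimize f2(x, y) = y^2 - x*y
# Each player descends the gradient of its OWN cost with respect to its
# OWN decision variable, treating the other player's variable as fixed.

def grad_f1(x, y):
    return 2.0 * x + y      # d f1 / d x

def grad_f2(x, y):
    return 2.0 * y - x      # d f2 / d y

x, y = 1.0, 1.0             # arbitrary initial joint action
lr = 0.1                    # shared learning rate (assumed)

for _ in range(500):
    gx, gy = grad_f1(x, y), grad_f2(x, y)
    x, y = x - lr * gx, y - lr * gy   # simultaneous update

# The game Jacobian [[2, 1], [-1, 2]] has eigenvalues 2 +/- i, whose real
# parts are positive, so the unique Nash equilibrium (0, 0) is a stable
# fixed point of these dynamics and the iterates spiral into it.
print(x, y)
```

The opposite-signed coupling terms (`+x*y` vs. `-x*y`) give the Jacobian a complex eigenvalue pair, so the joint action rotates as it converges; this kind of damped oscillation toward equilibrium is one signature of the transient (learning) behavior the abstract's predictions concern, under the assumptions stated above.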
