I See What You See: Inferring Sensor and Policy Models of Human Real-World Motor Behavior

Human motor behavior is naturally guided by sensing the environment. To predict such sensorimotor behavior, it is necessary to model what is sensed and how actions are chosen based on the obtained sensory measurements. Although several models of human sensing have been proposed, data on the assumed sensory measurements are rarely available, which makes statistical estimation of sensor models problematic. To overcome this issue, we propose an abstract structural estimation approach that builds on the ideas of Herman et al.'s Simultaneous Estimation of Rewards and Dynamics (SERD). Assuming optimal fusion of sensory information and rational choice of actions, the proposed method allows sensor models to be inferred even in the absence of data on the sensory measurements. To the best of our knowledge, this work presents the first general approach for joint inference of sensor and policy models. Furthermore, we consider its concrete implementation for the important class of sensor-scheduling linear quadratic Gaussian (LQG) problems. Finally, the effectiveness of the approach is demonstrated by predicting the behavior of automobile drivers. Specifically, we model glance and steering behavior during driving in the presence of visually demanding secondary tasks. The results show that prediction benefits from the inference of sensor models, especially when the information contained in gaze-switching behavior is also taken into account.
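The core idea, inferring a sensor model from state and action trajectories alone under the assumptions of optimal sensory fusion (Kalman filtering) and rational action selection (LQG control), can be illustrated with a small synthetic example. The sketch below is not the paper's implementation: the 2-D lateral vehicle model, glance schedule, cost weights, motor-noise level, and the grid search over an off-road sensor-noise parameter `r_off` are all illustrative assumptions.

```python
# A minimal, self-contained sketch of sensor-model inference from behavior alone,
# under illustrative assumptions (not the parameters or code of the paper).
import numpy as np

rng = np.random.default_rng(0)

dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])        # lateral position / lateral velocity
B = np.array([[0.0], [dt]])
C = np.array([[1.0, 0.0]])                   # the driver senses lateral position
W = 0.01 * np.eye(2)                         # process noise covariance
SIGMA_U = 0.05                               # assumed motor noise on the steering action


def lqr_gain(A, B, Q, R, iters=500):
    """Infinite-horizon discrete-time LQR gain via Riccati iteration (rational policy)."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K


K = lqr_gain(A, B, Q=np.diag([1.0, 0.1]), R=np.array([[0.1]]))


def kalman_schedule(gaze, r_on, r_off):
    """Time-varying Kalman gains of the driver's internal filter (optimal fusion)
    for a glance schedule; r_on / r_off are sensor noise variances eyes on / off road."""
    P, gains, rs = np.eye(2), [], []
    for g in gaze:
        r = r_on if g == 1 else r_off
        P = A @ P @ A.T + W                              # predict
        L = P @ C.T / (C @ P @ C.T + r)                  # gain
        P = (np.eye(2) - L @ C) @ P                      # update
        gains.append(L)
        rs.append(r)
    return gains, rs


def simulate(gaze, r_on, r_off, T):
    """Generate synthetic state/action data from the assumed LQG driver model."""
    gains, rs = kalman_schedule(gaze, r_on, r_off)
    x, xhat = np.zeros((2, 1)), np.zeros((2, 1))
    xs, us = [x], []
    for t in range(T):
        u = (-K @ xhat).item() + SIGMA_U * rng.standard_normal()
        x = A @ x + B * u + rng.multivariate_normal(np.zeros(2), W).reshape(2, 1)
        z = (C @ x).item() + np.sqrt(rs[t]) * rng.standard_normal()
        pred = A @ xhat + B * u
        xhat = pred + gains[t] * (z - (C @ pred).item())
        xs.append(x)
        us.append(u)
    return xs, us


def action_log_likelihood(xs, us, gaze, r_on, r_off):
    """Log-likelihood of the observed actions given states and glances, with the
    driver's unobserved measurements and beliefs integrated out analytically:
    once the states are known, the belief xhat is a linear-Gaussian latent chain."""
    gains, rs = kalman_schedule(gaze, r_on, r_off)
    m, V = np.zeros((2, 1)), np.zeros((2, 2))            # posterior over xhat_t
    ll = 0.0
    for t, u in enumerate(us):
        # likelihood of the observed action: u_t = -K xhat_t + motor noise
        mean_u = (-K @ m).item()
        var_u = (K @ V @ K.T).item() + SIGMA_U ** 2
        ll += -0.5 * (np.log(2 * np.pi * var_u) + (u - mean_u) ** 2 / var_u)
        # condition the belief posterior on that action
        g = V @ (-K).T / var_u
        m = m + g * (u - mean_u)
        V = V - g @ (-K @ V)
        # propagate through the driver's filter, plugging in the observed next state
        L, F = gains[t], np.eye(2) - gains[t] @ C
        m = F @ (A @ m + B * u) + L * (C @ xs[t + 1]).item()
        V = F @ (A @ V @ A.T) @ F.T + rs[t] * (L @ L.T)
    return ll


# Recover the off-road sensor noise from behavior alone (no sensory data needed).
T = 2000
gaze = (rng.random(T) < 0.7).astype(int)                 # illustrative glance schedule
xs, us = simulate(gaze, r_on=0.01, r_off=1.0, T=T)
candidates = [0.05, 0.1, 0.25, 0.5, 1.0, 2.0, 4.0]
lls = [action_log_likelihood(xs, us, gaze, 0.01, r) for r in candidates]
print("most likely r_off:", candidates[int(np.argmax(lls))])   # should be near 1.0
```

Richer sensor parameterizations (e.g., separate foveal and peripheral noise, or a stochastic gaze policy) fit the same likelihood construction; the grid search here would then simply be replaced by gradient-based maximization.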

[1] B. D. Ziebart et al., "Predictive Inverse Optimal Control for Linear-Quadratic-Gaussian Systems," AISTATS, 2015.

[2] M. M. Hayhoe et al., "Task and context determine where you look," Journal of Vision, 2016.

[3] H. Summala et al., "Maintaining Lane Position with Peripheral Vision during In-Vehicle Tasks," Human Factors, 1996.

[4] W. Zhang et al., "On efficient sensor scheduling for linear dynamical systems," American Control Conference (ACC), 2010.

[5] A. K. Dey et al., "Modeling Interaction via the Principle of Maximum Causal Entropy," ICML, 2010.

[6] C. Szepesvári et al., "Apprenticeship Learning using Inverse Reinforcement Learning and Gradient Methods," UAI, 2007.

[7] L. Acerbi et al., "A Framework for Testing Identifiability of Bayesian Models of Perception," NIPS, 2014.

[8] W. Burgard et al., "Inverse Reinforcement Learning with Simultaneous Estimation of Rewards and Dynamics," AISTATS, 2016.

[9] M. Shimosaka et al., "Modeling risk anticipation and defensive driving on residential roads with inverse reinforcement learning," IEEE International Conference on Intelligent Transportation Systems (ITSC), 2014.

[10] P. McCullagh et al., Generalized Linear Models, 1984.

[11] G. Stewart, "On the Perturbation of Pseudo-Inverses, Projections and Linear Least Squares Problems," 1977.

[12] D. J. Cole et al., "A review of human sensory dynamics for application to models of driver steering and speed control," Biological Cybernetics, 2016.

[13] R. Risack, "Robust lane recognition embedded in a real-time driver assistance system," 1998.

[14] W. Burgard et al., "Learning driving styles for autonomous vehicles from demonstration," IEEE International Conference on Robotics and Automation (ICRA), 2015.

[15] R. Stiefelhagen et al., "Predicting lane keeping behavior of visually distracted drivers using inverse suboptimal control," IEEE Intelligent Vehicles Symposium (IV), 2016.

[16] R. Stiefelhagen et al., "Exact Maximum Entropy Inverse Optimal Control for modeling human attention switching and control," IEEE International Conference on Systems, Man, and Cybernetics (SMC), 2016.

[17] E. Amir et al., "Bayesian Inverse Reinforcement Learning," IJCAI, 2007.

[18] D. L. Kleinman et al., "The Human as an Optimal Controller and Information Processor," 1969.

[19] M. I. Jordan et al., "Optimal feedback control as a theory of motor coordination," Nature Neuroscience, 2002.

[20] I. Segall et al., "Identification of a modified optimal control model for the human operator," Automatica, 1976.

[21] W. Richards et al., Perception as Bayesian Inference, 2008.

[22] B. M. Yu et al., "Learning an Internal Dynamics Model from Control Demonstration," ICML, 2013.

[23] C. Dimitrakakis et al., "Preference elicitation and inverse reinforcement learning," ECML/PKDD, 2011.

[24] J. Peters et al., "Catching heuristics are optimal control policies," NIPS, 2016.