Latent-Variable MDP Models for Adapting the Interaction Environment of Diverse Users

Interactive interfaces are a common feature of many systems, ranging from field robotics to video games. In many applications, these interfaces are targeted at a diverse and heterogeneous set of potential users, and correspondingly there is substantial variety in the types of interfaces that can be made available (e.g., keyboards vs. joysticks) and in how they may be configured (e.g., sensitivity levels). In this paper we are interested in solving the problem of personalizing an interface so that it adapts to present the user with the variation that is optimal with respect to their traits, such as skill level. We pose this problem by modelling the user as a parametrized Markov Decision Process (MDP), wherein the transition dynamics within a task depend on the latent skill level of the user. This notion of adapting at the level of action sets, picking optimally from a potentially very large space of action sets, is novel and provides a natural solution to the interface personalization problem. Our solution involves a latent-variable formulation wherein hidden personality traits, such as a user's skill level, are implicitly associated with a specific (optimal) choice of action set. We present an algorithm that iteratively eliminates candidate action sets to quickly arrive at the action set best suited to a particular user. We evaluate this procedure in an experiment involving a simulated remote navigation domain, demonstrating that the combination of latent-variable user models and type elimination outperforms baselines that do not model the diversity of user actions in this way.
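
To make the type-elimination idea concrete, the sketch below illustrates one way such a procedure could work, assuming a finite set of candidate user types, each modelled as an MDP with its own transition dynamics and an associated preferred action set (interface variant). The class names, the observation format, and the concrete L1-deviation threshold are illustrative assumptions rather than the paper's implementation: a type is discarded once the empirical transition frequencies observed from the user deviate from that type's predicted dynamics by more than a concentration bound.

```python
import numpy as np

class UserTypeModel:
    """Hypothetical model of one latent user type: an MDP whose
    transition dynamics depend on that type, plus the interface
    (action set) best suited to it."""
    def __init__(self, name, transition_probs, preferred_action_set):
        # transition_probs: dict mapping (state, action) -> probability
        # vector over next states for this user type
        self.name = name
        self.transition_probs = transition_probs
        self.preferred_action_set = preferred_action_set


def eliminate_types(candidate_types, observations, confidence=0.95):
    """Discard candidate user types whose predicted dynamics are
    statistically inconsistent with the observed transitions.

    observations: dict mapping (state, action) -> list of observed
    next-state indices. A type is eliminated when the L1 distance
    between its predicted next-state distribution and the empirical
    one exceeds a Weissman-style L1 deviation bound for empirical
    distributions; the exact bound used here is an illustrative choice."""
    surviving = []
    for user_type in candidate_types:
        consistent = True
        for (s, a), next_states in observations.items():
            n = len(next_states)
            if n == 0 or (s, a) not in user_type.transition_probs:
                continue
            predicted = np.asarray(user_type.transition_probs[(s, a)], dtype=float)
            k = len(predicted)
            empirical = np.bincount(next_states, minlength=k) / n
            # L1 deviation threshold shrinking at rate 1/sqrt(n)
            threshold = np.sqrt(2.0 * (k * np.log(2.0) - np.log(1.0 - confidence)) / n)
            if np.abs(empirical - predicted).sum() > threshold:
                consistent = False
                break
        if consistent:
            surviving.append(user_type)
    return surviving
```

Once only one candidate type survives (or after a fixed budget of interactions), the interface would present the action set associated with the remaining type; the specific elimination test and stopping rule used in the paper may differ from this sketch.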
