Are You Doing What I Think You Are Doing? Criticising Uncertain Agent Models

The key to effective interaction in many multiagent applications is to reason explicitly about the behaviour of other agents, in the form of a hypothesised behaviour. While several methods exist for constructing a behavioural hypothesis, there is currently no universal theory that would allow an agent to contemplate the correctness of a hypothesis. In this work, we present a novel algorithm that decides this question in the form of a frequentist hypothesis test. The algorithm allows for multiple metrics in the construction of the test statistic and learns the distribution of the statistic during the interaction process, with asymptotic correctness guarantees. We present results from a comprehensive set of experiments, demonstrating that the algorithm achieves high accuracy and scalability at low computational cost.
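To make the idea of testing a behavioural hypothesis concrete, the following is a minimal sketch and not the algorithm of this paper. It assumes a hypothesis given as one action distribution per time step, a single illustrative score (the average probability the hypothesis assigned to the observed actions), and a null distribution estimated by plain Monte Carlo sampling from the hypothesis rather than learned during the interaction. The names `score`, `sample_actions` and `p_value` are hypothetical.

```python
import random


def score(hyp_probs, actions):
    """Average probability the hypothesis assigned to the actions taken."""
    return sum(p[a] for p, a in zip(hyp_probs, actions)) / len(actions)


def sample_actions(hyp_probs, rng):
    """Draw one action per time step from the hypothesised distributions."""
    return [rng.choices(range(len(p)), weights=p)[0] for p in hyp_probs]


def p_value(hyp_probs, observed, n_samples=2000, seed=0):
    """One-sided Monte Carlo p-value for 'observed was generated by hyp_probs'.

    Assumption: a low score is evidence against the hypothesis, so we count
    simulated scores that are at most as large as the observed score.
    """
    rng = random.Random(seed)
    obs = score(hyp_probs, observed)
    null_scores = [
        score(hyp_probs, sample_actions(hyp_probs, rng))
        for _ in range(n_samples)
    ]
    at_most = sum(1 for s in null_scores if s <= obs)
    # Add-one correction keeps the estimate strictly positive.
    return (at_most + 1) / (n_samples + 1)


# Hypothesis: the other agent picks action 0 with probability 0.8 at every step.
hypothesis = [[0.8, 0.1, 0.1]] * 50
consistent = [0] * 50      # behaviour the hypothesis predicts well
inconsistent = [1] * 50    # behaviour the hypothesis predicts poorly

p_consistent = p_value(hypothesis, consistent)
p_inconsistent = p_value(hypothesis, inconsistent)
reject = p_inconsistent < 0.05  # reject the hypothesis at the 5% level
```

In this sketch the agent would reject the hypothesis whenever the p-value falls below a chosen significance level. Combining several metrics into one test statistic, as the abstract describes, would replace the single `score` function.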
