Skilloscopy: Bayesian Modeling of Decision Makers' Skill

This paper proposes and demonstrates Skilloscopy as an approach to the assessment of decision makers. In an increasingly sophisticated, connected, and information-rich world, decision making is becoming both more important and more difficult. At the same time, modeling decision making on computers is becoming more feasible and more interesting, partly because the information input to those decisions is increasingly on record. The aims of Skilloscopy are to rate and rank decision makers in a domain relative to each other; these aims do not include analyzing why a decision is wrong or suboptimal, nor modeling the underlying cognitive process of making the decisions. In the proposed method, a decision maker is characterized by a probability distribution of its competence in choosing among quantifiable alternatives. This probability distribution is derived by classic Bayesian inference from a combination of prior belief and evidence of the decisions. Thus, decision makers' skills may be better compared, rated, and ranked. The proposed method is applied and evaluated in the game domain of chess. A large set of games by players across a broad range of World Chess Federation (FIDE) Elo ratings has been used to infer the distribution of players' ratings directly from the moves they play rather than from game outcomes. Demonstration applications address questions frequently asked by the chess community regarding the stability of the Elo rating scale, the comparison of players of different eras and/or leagues, and controversial incidents possibly involving fraud. The method of Skilloscopy may be applied in any decision domain where the value of the decision options can be quantified.
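The Bayesian step described above can be illustrated with a minimal sketch. It is not the paper's implementation: the discrete grid of competence levels, the softmax-style choice model in `choice_likelihood`, and the example option values are all assumptions made here for illustration. The sketch only shows how a prior over competence, combined with evidence of observed choices among valued options, yields a posterior distribution.

```python
import math

def choice_likelihood(values, chosen, c):
    """P(chosen option | competence c) under an ASSUMED softmax choice model.

    c = 0 means choosing uniformly at random; larger c concentrates
    probability on the highest-valued option.
    """
    best = max(values)
    weights = [math.exp(c * (v - best)) for v in values]
    return weights[chosen] / sum(weights)

def update(prior, levels, values, chosen):
    """One Bayesian update: posterior is proportional to prior times likelihood."""
    unnorm = [p * choice_likelihood(values, chosen, c)
              for p, c in zip(prior, levels)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Hypothetical competence grid and a uniform prior belief over it.
levels = [0.0, 0.5, 1.0, 2.0, 4.0]
prior = [1.0 / len(levels)] * len(levels)

# Hypothetical evidence: (values of the available options, index chosen).
decisions = [
    ([0.0, -0.3, -1.5], 0),   # best option chosen
    ([0.2, 0.1, -2.0], 0),    # best option chosen
    ([1.0, 0.9, 0.0], 1),     # slightly suboptimal option chosen
]

posterior = prior
for values, chosen in decisions:
    posterior = update(posterior, levels, values, chosen)

# A point summary of the posterior, usable for rating and ranking.
expected_competence = sum(p * c for p, c in zip(posterior, levels))
print(posterior, expected_competence)
```

Two decision makers could then be compared through their posterior distributions or through summaries such as `expected_competence`. In a chess setting, the option values might come from an engine's evaluation of each legal move, although that mapping is likewise an assumption of this sketch.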
