Information-theoretic approach to interactive learning

The principles of statistical mechanics and information theory play an important role in learning and have inspired both theory and the design of numerous machine learning algorithms. The new aspect of this paper is its focus on integrating feedback from the learner. A quantitative approach to interactive learning and adaptive behavior is proposed, integrating model making and decision making into one theoretical framework. The approach follows a simple principle: the observer's world model and action policy should together yield maximal predictive power at minimal complexity. Classes of optimal action policies and of optimal models are derived from an objective function that reflects this trade-off between prediction and complexity. The resulting optimal models then summarize, at different levels of abstraction, the causal organization of the process in the presence of the learner's actions. A fundamental consequence of the proposed principle is that the learner's optimal action policies balance exploration and control as an emergent property. Interestingly, the explorative component is present even in the absence of policy randomness, i.e., in optimal deterministic behavior. This is a direct result of requiring maximal predictive power in the presence of feedback.
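
For concreteness, the prediction-complexity trade-off can be written as an information-bottleneck-style objective. The sketch below uses illustrative notation (X for the observed history, S for the learner's internal state, A for the action, X' for the future, and a Lagrange multiplier lambda >= 0); it conveys the general form of such an objective and is not taken verbatim from the paper:

% Hedged sketch of an information-bottleneck-style objective for
% interactive learning. The learner chooses a stochastic map p(s,a|x)
% from histories to internal states and actions; notation is illustrative.
\[
  \max_{p(s,\,a \mid x)} \; I\bigl(\{S,A\}\,;\,X'\bigr) \;-\; \lambda\, I\bigl(\{S,A\}\,;\,X\bigr), \qquad \lambda \ge 0 .
\]

Here the first term rewards predictive power about the future, which in the interactive setting depends on the chosen actions, and the second penalizes the coding cost (complexity) of the model and policy. It is this dependence of X' on A that lets an explorative component emerge even for deterministic optimal policies.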
