The Human Kernel

Bayesian nonparametric models, such as Gaussian processes, provide a compelling framework for automatic statistical modelling: these models have a high degree of flexibility and automatically calibrated complexity. However, automating human expertise remains elusive; for example, Gaussian processes with standard kernels struggle on function extrapolation problems that are trivial for human learners. In this paper, we create function extrapolation problems and acquire human responses, and then design a kernel learning framework to reverse-engineer the inductive biases of human learners across a set of behavioral experiments. We use the learned kernels to gain psychological insights and to extrapolate in human-like ways that go beyond traditional stationary and polynomial kernels. Finally, we investigate Occam's razor in human and Gaussian-process-based function learning.
