Nonparametric Problem-Space Clustering: Learning Efficient Codes for Cognitive Control Tasks

We present an information-theoretic method that finds structure in a problem space (here, a spatial navigation domain) and clusters it in ways that are convenient for solving different classes of control problems, including planning a path to a goal from a known or an unknown location, achieving multiple goals, and exploring a novel environment. Our generative nonparametric approach, called the generative embedded Chinese restaurant process (geCRP), extends the family of Chinese restaurant process (CRP) models by introducing a parameterizable notion of distance (or kernel) between the states to be clustered together. By using different kernels, such as the conditional or joint probability of two states, the same geCRP method clusters the environment in ways that are more sensitive to different control-related information, such as goal, sub-goal and path information. We perform a series of simulations in three scenarios—an open space, a grid world with four rooms and a maze with the same structure as the Tower of Hanoi—in order to illustrate the characteristics of the different clusters (obtained using different kernels) and their relative benefits for solving planning and control problems.
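
Since the geCRP builds on the distance-dependent Chinese restaurant process, a minimal sketch may help convey the clustering mechanism: each state links either to another state, with probability proportional to a kernel-derived affinity, or to itself with weight alpha (opening a new cluster), and the connected components of the resulting link graph define the partition. The sketch below is illustrative only; the kernel matrix, the parameter names and the helper functions (`ddcrp_sample_links`, `links_to_clusters`) are assumptions for exposition rather than the paper's implementation, and it samples from the clustering prior without the generative likelihood used by the full geCRP.

```python
import numpy as np

def ddcrp_sample_links(kernel, alpha, rng=None):
    """Sample state-to-state links in a distance-dependent CRP prior.

    kernel : (N, N) array of nonnegative affinities between states
             (e.g. derived from conditional or joint transition
             probabilities); kernel[i, j] scales the probability that
             state i links to state j.
    alpha  : self-link weight (concentration parameter) controlling
             how readily new clusters are opened.
    Returns links, where links[i] = j means state i links to state j
    and links[i] = i means state i starts its own cluster.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = kernel.shape[0]
    links = np.empty(n, dtype=int)
    for i in range(n):
        weights = kernel[i].astype(float)
        weights[i] = alpha                 # self-link opens a new cluster
        weights /= weights.sum()
        links[i] = rng.choice(n, p=weights)
    return links

def links_to_clusters(links):
    """Follow link chains to recover the induced partition of states."""
    n = len(links)
    labels = -np.ones(n, dtype=int)
    next_label = 0
    for i in range(n):
        # walk the chain until reaching a labelled state or a cycle
        path, j = [], i
        while labels[j] == -1 and j not in path:
            path.append(j)
            j = links[j]
        if labels[j] == -1:                # new cluster discovered
            labels[j] = next_label
            next_label += 1
        for k in path:
            labels[k] = labels[j]
    return labels

# Example: cluster 5 states whose affinity decays with index distance
# (a stand-in for a transition-probability-based kernel).
K = np.exp(-np.abs(np.subtract.outer(np.arange(5), np.arange(5))))
labels = links_to_clusters(ddcrp_sample_links(K, alpha=1.0))
```

Under this kind of prior, the choice of kernel is what makes the same machinery sensitive to different control-related structure: an affinity built from conditional transition probabilities groups states along paths, whereas one built from joint probabilities under goal-directed policies tends to group states around goals and sub-goals.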
