A rational account of human memory search

Performing everyday tasks requires the ability to search through and retrieve past memories. A central paradigm for studying human memory search is the semantic fluency task, in which participants are asked to retrieve as many items as possible from a category (e.g., animals). Observed responses tend to be clustered semantically. To understand when the mind decides to switch from one cluster/patch to the next, recent work has proposed two competing mechanisms. Under the first switching mechanism, people make a strategic decision to leave a depleted patch based on the marginal value theorem, analogous to optimal foraging in a spatial environment. The second switching mechanism demonstrates that similar behavioral patterns can emerge from a random walk on a semantic network, without any strategic switches. In the current work, instead of comparing the competing switching mechanisms against observed human data, we propose a rational account of the problem by asking what the optimal patch-switching policy would be under the framework of reinforcement learning. The reinforcement learning agent, a Deep Q-Network (DQN), is built upon the random walk model and allows strategic switches based on features of the local semantic patch. After learning from rewards, the agent's resulting policy gives rise to a third switching mechanism, which outperforms the previous two. Our results provide theoretical justification for strategies observed in human memory search, and shed light on how an optimal AI agent operating under realistic human constraints can generate hypotheses about human strategies in the same task.
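
To make the stay-versus-switch framing concrete, the sketch below shows how a DQN can learn a patch-switching policy in a toy fluency environment. This is a hypothetical illustration, not the paper's implementation: the environment, the two patch features (fraction of the current patch already retrieved, and steps since the last novel retrieval), and all hyperparameters are illustrative assumptions, and the sketch abstracts away the semantic network underlying the actual random-walk-based agent. Requires numpy and torch.

```python
import random
from collections import deque

import numpy as np
import torch
import torch.nn as nn

N_PATCHES, PATCH_SIZE, EPISODE_LEN = 8, 6, 60

class ToyFluencyEnv:
    """Toy stand-in for semantic fluency: items live in patches (clusters).
    Action 0 ("stay") samples an item from the current patch; action 1
    ("switch") jumps to a random patch at the cost of one time step.
    Reward is +1 for each novel item retrieved."""

    def reset(self):
        self.retrieved = [set() for _ in range(N_PATCHES)]
        self.patch = random.randrange(N_PATCHES)
        self.dry_steps = 0  # steps since the last novel retrieval
        self.t = 0
        return self._features()

    def _features(self):
        # Hypothetical local-patch features: how depleted the current
        # patch is, and how long retrieval has been unproductive.
        depletion = len(self.retrieved[self.patch]) / PATCH_SIZE
        return np.array([depletion, min(self.dry_steps, 5) / 5.0],
                        dtype=np.float32)

    def step(self, action):
        reward = 0.0
        if action == 1:                      # switch: no retrieval this step
            self.patch = random.randrange(N_PATCHES)
            self.dry_steps = 0
        else:                                # stay: draw from current patch
            item = random.randrange(PATCH_SIZE)
            if item not in self.retrieved[self.patch]:
                self.retrieved[self.patch].add(item)
                reward, self.dry_steps = 1.0, 0
            else:
                self.dry_steps += 1
        self.t += 1
        return self._features(), reward, self.t >= EPISODE_LEN

# A small Q-network over the two patch features; no target network or
# other DQN refinements, to keep the sketch short.
qnet = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 2))
opt = torch.optim.Adam(qnet.parameters(), lr=1e-3)
buffer, gamma = deque(maxlen=10_000), 0.95

env = ToyFluencyEnv()
for episode in range(300):
    s, done = env.reset(), False
    eps = max(0.05, 1.0 - episode / 200)     # decaying epsilon-greedy
    while not done:
        if random.random() < eps:
            a = random.randrange(2)
        else:
            with torch.no_grad():
                a = int(qnet(torch.from_numpy(s)).argmax())
        s2, r, done = env.step(a)
        buffer.append((s, a, r, s2, done))
        s = s2
        if len(buffer) >= 64:                # one TD update per step
            batch = random.sample(list(buffer), 64)
            S, A, R, S2, D = (torch.tensor(np.array(x)) for x in zip(*batch))
            q = qnet(S).gather(1, A.unsqueeze(1)).squeeze(1)
            with torch.no_grad():
                target = R.float() + gamma * qnet(S2).max(1).values * ~D
            loss = nn.functional.mse_loss(q, target)
            opt.zero_grad(); loss.backward(); opt.step()
```

In this toy setting, the learned Q-values typically come to favor switching once the depletion and dry-step features are high, qualitatively echoing the marginal-value-theorem account while the policy itself is driven entirely by local retrieval dynamics, which is the kind of third mechanism the abstract describes.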
