What to simulate? Inferring the right direction for mental rotation

Jessica B. Hamrick (jhamrick@berkeley.edu)
Thomas L. Griffiths (tom_griffiths@berkeley.edu)
Department of Psychology, University of California, Berkeley, CA 94720 USA

Abstract

When people use mental imagery, how do they decide which images to generate? To answer this question, we explored how mental simulation should be used in the classic psychological task of determining if two images depict the same object in different orientations (Shepard & Metzler, 1971). Through a rational analysis of mental rotation, we formalized four models and compared them to human performance. We found that three models based on previous hypotheses in the literature were unable to account for several aspects of human behavior. The fourth is based on the idea of active sampling (e.g., Gureckis & Markant, 2012), which is a strategy of choosing actions that will provide the most information. This last model provides a plausible account of how people use mental rotation, where the other models do not. Based on these results, we suggest that the question of “what to simulate?” is more difficult than has previously been assumed, and that an active learning approach holds promise for uncovering the answer.

Keywords: mental rotation, computational modeling

Introduction

One of the most astonishing cognitive feats is our ability to envision, manipulate, and plan with objects, all without actually perceiving them. This mental simulation has been widely studied, including an intense debate about the underlying representation of mental images (e.g., Kosslyn, Thompson, & Ganis, 2009; Pylyshyn, 2002). But this debate has not addressed one of the most fundamental questions about mental simulation: how people decide what to simulate.

Mental rotation provides a simple example of the decision problem posed by simulation. In the classic experiment by Shepard and Metzler (1971), participants viewed images of three-dimensional objects and had to determine whether the images depicted the same object (which differed by a rotation) or two separate objects (which differed by a reflection and a rotation). They found that people’s response times (RTs) had a strong linear correlation with the minimum angle of rotation, a result which led to the conclusion that people solve this task by “mentally rotating” the objects until they are congruent. However, this explanation leaves several questions unanswered. How do people know the axis around which to rotate the objects? If the axis is known, how do people know which direction to rotate the objects? And finally, how do people know how long to rotate?

In this paper, we explore these questions through rational analysis (Marr, 1983; Anderson, 1990; Shepard, 1987) and compare four models of mental rotation. We begin the paper by discussing the previous literature on mental imagery. Next, we outline computational- and algorithmic-level analyses of the problem of mental rotation. We then describe a behavioral experiment based on the classic mental rotation studies (e.g., Cooper, 1975), and compare the results of our experiment with each of the models. We conclude with a discussion of the strengths and weaknesses of each model, and lay out directions for future work.

Modeling mental rotation

Previous models of mental rotation have largely focused on the representation of mental images, rather than how people decide which mental images to generate. Kosslyn and Shwartz (1977) proposed a model of the mental imagery buffer, but did not say how it should be used. Similarly, Julstrom and Baron (1985) and Glasgow and Papadias (1992) were mostly concerned with modeling the representational format underlying imagery. Although Anderson (1978) emphasized the importance of considering both representation and process, he dismissed the problem of determining the direction of rotation as a “technical difficulty”.

The only models (of which the authors are aware) that seriously attempted to address the decision of what to simulate are those by Funt (1983) and Just and Carpenter (1985). In both of these models, the axis and direction of rotation are computed prior to performing the rotation. One object is then rotated through the target rotation, and is checked against the other object for congruency. However, this approach assumes that the corresponding points on the two objects can be easily identified, which is not necessarily the case. Indeed, the state of the art in computer vision suggests that there is more to this problem than checking for congruency, particularly when the shapes are complex or not exactly the same (e.g., Belongie, Malik, & Puzicha, 2002; Sebastian, Klein, & Kimia, 2003). Additionally, recent research shows that when performing physical rotations, people do not rotate until congruency is reached; they may even rotate away from near-perfect matches (Gardony, Taylor, & Brunyé, 2014).

If people are not computing the rotation beforehand, what might they be doing? To answer this question, we perform a rational analysis of the problem of mental rotation (Marr, 1983; Anderson, 1990; Shepard, 1987). At the computational level, we can say that the problem is to determine which spatial transformations an object has undergone based on two images of that object (which do not include information about point correspondences). At the algorithmic level, we are constrained by the notion that mental images must be transformed in an analog manner (or in a way that is approximately analog), and that mental images are time-consuming and effortful to generate. Thus, the goal is to make this determination while performing a minimum amount of computation (i.e., as few rotations as possible).

The original “congruency” hypothesis (Shepard & Metzler, 1971) is a rational solution to this problem, in the sense

[1]  R. Shepard, et al. Mental Rotation of Three-Dimensional Objects, 1971, Science.

[2]  L. Cooper. Mental rotation of random two-dimensional shapes, 1975, Cognitive Psychology.

[3]  M. Just, et al. Eye fixations and cognitive processes, 1976, Cognitive Psychology.

[4]  Stephen M. Kosslyn, et al. A Simulation of Visual Imagery, 1977, Cogn. Sci.

[5]  John R. Anderson. Arguments concerning representations for mental imagery, 1978.

[6]  J. H. Steiger, et al. Nonholistic processing in mental rotation: Some suggestive evidence, 1982, Perception & Psychophysics.

[7]  Brian V. Funt, et al. A Parallel-Process Model of Mental Rotation, 1983, Cogn. Sci.

[8]  Bryant A. Julstrom, et al. A Model of Mental Imagery, 1985, Int. J. Man Mach. Stud.

[9]  M. Just, et al. Cognitive coordinate systems: accounts of mental rotation and individual differences in spatial ability, 1985, Psychological Review.

[10]  R. Shepard, et al. Toward a universal law of generalization for psychological science, 1987, Science.

[11]  S. Gupta, et al. Statistical decision theory and related topics IV, 1988.

[12]  P. Diaconis. Bayesian Numerical Analysis, 1988.

[13]  John R. Anderson. The Adaptive Character of Thought, 1990.

[14]  D. Papadias, et al. Computational Imagery, 1992, Cogn. Sci.

[15]  Refractor Vision, 2000, The Lancet.

[16]  Z. Pylyshyn. Mental imagery: In search of a theory, 2002, Behavioral and Brain Sciences.

[17]  Philip N. Klein, et al. On Aligning Curves, 2003, IEEE Trans. Pattern Anal. Mach. Intell.

[18]  Carl E. Rasmussen, et al. Gaussian processes for machine learning, 2005, Adaptive computation and machine learning.

[19]  Jitendra Malik, et al. Shape matching and object recognition using shape contexts, 2002, IEEE Trans. Pattern Anal. Mach. Intell.

[20]  Mordechai Juni, et al. Don't Stop 'Til You Get Enough: Adaptive Information Sampling in a Visuomotor Estimation Task, 2011, CogSci.

[21]  Carl E. Rasmussen, et al. Active Learning of Model Evidence Using Bayesian Quadrature, 2012, NIPS.

[22]  Tad T. Brunyé, et al. What Does Physical Rotation Reveal About Mental Rotation?, 2014, Psychological Science.

[23]  Anne Kuefer, et al. The Case For Mental Imagery, 2016.