Visual Memories and Mental Images

A model is presented for the architecture of the neural networks that encode visual information for storage and that reconstruct iconic representations from storage representations. (Iconic representations are geometrically similar to projections of the objects they represent.) Each storage representation consists of a sequence of patterns derived while the eyes fixate at different positions in the visual field. Each pattern in the sequence has three components: (1) a control component, which describes both where the eyes fixated and the size of the attended scene fragment; (2) a surface quality component, which describes the visual surface characteristics of the object; and (3) a spatial component, which describes the spatial extent, spatial position (depth), surface orientation and visual flow (movement) of the surface having the specified surface characteristics. Prior to storage, all spatial components are transformed using a complex logarithmic mapping. As a consequence, stored spatial patterns are not iconic representations of the scene fragments they represent. A further consequence is that storage representations can be recognized and reconstructed at any desired size and orientation: they are size and orientation invariant. During reconstruction, each pattern in the storage representation is transformed back into an iconic representation using a complex exponential mapping. One consequence of the combined complex logarithmic and exponential mappings and the limited size of the storage representations is that the fidelity of each recalled fragment degrades exponentially with distance from its centre. A neural network, called spatial memory, not only holds the partially reconstructed representation during recall, but also shifts it to remain in registration with the fragment currently being recalled and combined.
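The size and orientation invariance described above follows from a standard property of the complex logarithm: scaling and rotating a point about the fixation centre becomes a simple shift of its logarithm. The following is a minimal sketch of that property only, not the neural network architecture of the model. The function names, the sample points and the chosen scale and angle are illustrative assumptions.

```python
import cmath
import math

def to_storage(z: complex) -> complex:
    """Complex logarithmic mapping: w = ln|z| + i*arg(z).

    z is a position relative to the fixation point. The fixation point
    itself (z == 0) is singular and must be excluded by the caller.
    """
    return cmath.log(z)

def from_storage(w: complex) -> complex:
    """Complex exponential mapping, the inverse of to_storage."""
    return cmath.exp(w)

def size_orientation_shift(scale: float, angle: float) -> complex:
    """Shift in the storage domain that corresponds to magnifying by
    `scale` and rotating by `angle` radians in the iconic domain."""
    return complex(math.log(scale), angle)

# Hypothetical fragment: three positions relative to the fixation point.
points = [complex(1.0, 0.0), complex(0.5, 0.5), complex(-0.2, 0.8)]

# Encode once, independent of the size and orientation wanted at recall.
stored = [to_storage(p) for p in points]

# Recall at twice the size, rotated by 30 degrees: add one constant.
shift = size_orientation_shift(2.0, math.pi / 6)
recalled = [from_storage(w + shift) for w in stored]

# Reference result computed directly in the iconic domain.
expected = [2.0 * cmath.exp(1j * math.pi / 6) * p for p in points]
```

Because the same stored values serve every recall size and orientation, only the added constant changes between reconstructions.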
The control system uses the control component of each stored pattern, together with knowledge of the size and orientation of the reconstruction, to determine how to shift the partially reconstructed representation in spatial memory. Because fidelity decreases from the centre to the perimeter of each reconstructed scene fragment, spatial memory preserves, wherever fragments overlap, only the information having the highest fidelity. It does so by maintaining and using fidelity information for each position in the reconstructed representation. Spatial memory can maintain a current, stable representation of the visual world. It can also magnify, reduce, shift and rotate representations. The representations are therefore independent of their position in spatial memory. It is suggested that the representations held and processed by spatial memory correspond to what we call mental images, and for this reason they are called mental images in the model.
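The rule that spatial memory keeps, at each position, the information with the highest fidelity can be illustrated with a small sketch. This is an assumption-laden toy, not the network proposed in the model: the grid stored as a dictionary, the exponential fall-off function, its `falloff` parameter and the fragment contents are all hypothetical.

```python
import math

def fidelity(dx: float, dy: float, falloff: float = 1.0) -> float:
    """Assumed fidelity profile: decays exponentially with distance
    from the centre of the recalled fragment."""
    return math.exp(-falloff * math.hypot(dx, dy))

def merge_fragment(memory, fragment, centre, falloff=1.0):
    """Combine one recalled fragment into spatial memory.

    memory   maps (x, y) -> (value, fidelity) for the image built so far
    fragment maps (x, y) -> value for the fragment being recalled
    centre   is the fragment's fixation position in memory coordinates

    A position is overwritten only when the new fragment offers strictly
    higher fidelity there than what memory already holds.
    """
    cx, cy = centre
    for (x, y), value in fragment.items():
        f = fidelity(x - cx, y - cy, falloff)
        if (x, y) not in memory or f > memory[(x, y)][1]:
            memory[(x, y)] = (value, f)
    return memory

# Two hypothetical overlapping fragments along one row of positions.
fragment_a = {(0, 0): "A", (1, 0): "A", (2, 0): "A"}  # fixated at x = 0
fragment_b = {(1, 0): "B", (2, 0): "B", (3, 0): "B"}  # fixated at x = 3

memory = {}
merge_fragment(memory, fragment_a, centre=(0, 0))
merge_fragment(memory, fragment_b, centre=(3, 0))

# Each position ends up holding the value from the nearer fixation.
row = [memory[(x, 0)][0] for x in range(4)]
```

Storing the fidelity next to each value is what allows later fragments to be merged in any order without losing the best available information.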
