A Neurally-Inspired Model for Detecting and Localizing Simple Motion Patterns in Image Sequences

In the present paper, we propose a neurally-inspired model of the primate motion processing hierarchy and describe its implementation as a computer simulation. The model aims to explain how a hierarchical feedforward network consisting of neurons in the cortical areas V1, MT, MST, and 7a of primates achieves the detection of different kinds of motion patterns. Moreover, the model includes a feedback gating network that implements a biologically plausible mechanism of visual attention. This mechanism is used for the sequential localization and fine-grained inspection of every motion pattern detected in the visual scene.

1 The Feedforward Mechanism of Motion Detection

Cells in striate area V1 are well known to be tuned to a particular local speed and direction of motion in at least three main speed ranges [1]. In the model, V1 neurons estimate local speed and direction in five-frame, 256×256 pixel image sequences using spatiotemporal filters (e.g., [2]). Their direction selectivity is restricted to 12 distinct, Gaussian-shaped tuning curves. Each tuning curve has a standard deviation of 30° and represents the selectivity for one of 12 directions spaced 30° apart (0°, 30°, ..., 330°). V1 is represented by a 60×60 array of hypercolumns. The receptive fields (RFs) of V1 neurons are circular and homogeneously distributed across the visual field, with the RFs of neighboring hypercolumns overlapping by 20%. In area MT, a high proportion of cells are tuned to a particular local speed and direction of movement, similar to the direction- and speed-selective cells in V1 [3, 4].
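The V1 direction selectivity described above can be sketched as a bank of 12 circular Gaussian tuning curves (σ = 30°) centred on the preferred directions 0°, 30°, ..., 330°. This is a minimal illustration, not the paper's implementation; the function names are ours.

```python
import numpy as np

PREFERRED_DIRS = np.arange(0, 360, 30)  # 12 preferred directions (deg)
SIGMA = 30.0                            # tuning-curve std. dev. (deg)

def angular_diff(a, b):
    """Smallest signed difference between two angles, in degrees."""
    return (a - b + 180.0) % 360.0 - 180.0

def v1_direction_responses(local_dir):
    """Responses of the 12 V1 direction channels of one hypercolumn
    to a local motion direction (degrees), via Gaussian tuning curves."""
    d = angular_diff(local_dir, PREFERRED_DIRS)
    return np.exp(-d**2 / (2.0 * SIGMA**2))

# A 45-deg motion drives the 30-deg and 60-deg channels equally and most strongly.
resp = v1_direction_responses(45.0)
```

In a full simulation, the input direction would itself come from the spatiotemporal filter stage; here it is simply given as a number.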
A proportion of MT neurons are also selective for a particular angle between movement direction and the spatial speed gradient [5]. Both types of neurons are represented in the MT layer of the model, which is a 30×30 array of hypercolumns. Each MT cell receives input from a 4×4 field of V1 neurons with the same direction and speed selectivity.

Neurons in area MST are tuned to complex motion patterns: expansion (approach), contraction (recession), and rotation, with RFs covering most of the visual field [6, 7]. Two types of neurons are modeled: one selective for translation (as in V1) and another selective for spiral motion (clockwise and counterclockwise rotation, expansion, contraction, and combinations thereof). MST is simulated as a 5×5 array of hypercolumns. Each MST cell receives input from a large group of MT neurons (covering 60% of the visual field) that respond to a particular motion/gradient angle. Any coherent motion/gradient angle indicates a particular type of spiral motion.

Finally, area 7a seems to involve at least four different types of computation [8]. Here, neurons are selective for translation and spiral motion as in MST, but with even larger RFs. They are also selective for rotation (regardless of direction) and radial motion (regardless of direction). In the simulation, area 7a is represented by a 4×4 array of hypercolumns. Each 7a cell receives input from a 4×4 field of MST neurons with the relevant tuning. Rotation cells and radial-motion cells receive input only from MST neurons that respond to spiral motion involving any rotation or any radial motion, respectively.

Fig. 1 shows the activation of neurons in the model induced by a sample stimulus. Note that in the actual visualization, different colors indicate the response to particular angles between motion and speed gradient in MT gradient neurons. In the present example, the gray levels indicate that the neurons selective for a 90° angle gave by far the strongest responses.
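Reading a spiral-motion type off a coherent motion/gradient angle, as the model's MST cells do, can be sketched as follows. Only the 90° → clockwise-rotation correspondence is stated in the text; the other angle labels and the coherence threshold are illustrative assumptions.

```python
import numpy as np

# Assumed angle-to-pattern mapping; only 90 deg -> clockwise rotation
# is given in the text, the rest is hypothetical.
SPIRAL_LABELS = {0: "expansion", 90: "clockwise rotation",
                 180: "contraction", 270: "counterclockwise rotation"}

def classify_spiral(angles_deg, coherence=0.8):
    """Classify the motion/gradient angles pooled from a field of MT
    gradient neurons (degrees) as one spiral-motion type, or None if
    the angles are not coherent enough to indicate a single pattern."""
    z = np.mean(np.exp(1j * np.radians(angles_deg)))  # circular mean vector
    if abs(z) < coherence:          # incoherent angles: no spiral pattern
        return None
    mean_angle = np.degrees(np.angle(z)) % 360.0
    # Pick the label whose reference angle is circularly closest.
    best = min(SPIRAL_LABELS,
               key=lambda a: abs((mean_angle - a + 180.0) % 360.0 - 180.0))
    return SPIRAL_LABELS[best]

# Angles clustered around 90 deg indicate clockwise rotation.
pattern = classify_spiral([88.0, 92.0, 90.0, 91.0])
```

The same pooling idea carries up to 7a: a rotation cell would accept either rotation label, a radial-motion cell either radial label.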
A consistent 90° angle across all directions of motion signifies a pattern of clockwise rotation. Correspondingly, the maximum activation of the spiral neurons in areas MST and 7a corresponds to the clockwise rotation pattern (90° angle). Finally, area 7a also shows a substantial response to rotation in the medium-speed range, while there is no visible activation that would indicate radial motion.

2 The Feedback Mechanism of Visual Attention

Most computational models of primate motion perception that have been proposed concentrate on bottom-up processing and do not address attentional issues. However, there is evidence that the responses of neurons in areas MT and MST can be modulated by attention (Treue & Maunsell, 1996). Moreover, we claim that attention is necessary for a precise localization of motion patterns in image sequences. As a result of the model's feedforward computations, the neural responses in the high-level areas (MST and 7a) roughly indicate the kind of motion patterns presented as input but do not localize the spatial positions of those patterns. In order to create a comprehensive motion model that agrees with biological findings and is capable of localizing motion patterns, we added a mechanism of visual attention to it. We decided to use the biologically plausible Selective Tuning approach [9], which required introducing a feedback gating network into the model. Each neuron in the original motion hierarchy received an assembly of gating units that control the bottom-up information flow to that neuron.
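The localization role of the gating network can be sketched, in heavily simplified form, as a top-down winner-take-all descent: starting from the most active top-level unit, only the gates to its most active afferent subfield are left open at each lower layer, pruning the rest until the pattern's position is pinned down at the finest layer. The layer shapes and the fixed 4×4 fan-in here are illustrative, not the model's exact connectivity.

```python
import numpy as np

def localize(layers, rf=4):
    """Winner-take-all gating descent, coarse to fine.

    layers: list of 2D activation maps, top level first; each unit at
    layer k is assumed to pool an rf x rf afferent field at layer k+1.
    Returns the (row, col) winner at the finest layer."""
    r, c = np.unravel_index(np.argmax(layers[0]), layers[0].shape)
    for finer in layers[1:]:
        # Gates stay open only for the winner's afferent subfield.
        sub = finer[r * rf:(r + 1) * rf, c * rf:(c + 1) * rf]
        dr, dc = np.unravel_index(np.argmax(sub), sub.shape)
        r, c = r * rf + dr, c * rf + dc  # winner within the finer layer
    return r, c
```

In the full model this pruning acts on the gating units attached to every neuron, so the feedforward responses are recomputed over the attended region alone, allowing each detected motion pattern to be localized and inspected in turn.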

References

[1] John K. Tsotsos, et al. Modeling visual attention via selective tuning. Artificial Intelligence, 1995.

[2] T. Meese, et al. Spiral mechanisms are required to account for summation of complex motion components. Vision Research, 2002.

[3] G. Orban, et al. Speed and direction selectivity of macaque middle temporal neurons. Journal of Neurophysiology, 1993.

[4] R. A. Andersen, et al. Neural responses to velocity gradients in macaque cortical area MT. Visual Neuroscience, 1996.

[5] L. M. Vaina, et al. Computational modelling of optic flow selectivity in MSTd neurons. Network, 1998.

[6] Eero P. Simoncelli, et al. A model of neuronal responses in visual area MT. Vision Research, 1998.

[7] D. J. Felleman, et al. Receptive-field properties of neurons in middle temporal visual area (MT) of owl monkeys. Journal of Neurophysiology, 1984.

[8] G. Orban, et al. Velocity sensitivity and direction selectivity of neurons in areas V1 and V2 of the monkey: influence of eccentricity. Journal of Neurophysiology, 1986.

[9] John H. R. Maunsell, et al. Attentional modulation of visual motion processing in cortical areas MT and MST. Nature, 1996.

[10] Stephen Grossberg, et al. Neural dynamics of motion integration and segmentation within and across apertures. Vision Research, 2001.

[11] Winky Yan Kei Wai, et al. A computational model for detecting image changes. 1994.

[12] David J. Heeger, et al. Optical flow using spatiotemporal filters. International Journal of Computer Vision, 2004.

[13] M. Graziano, et al. Tuning of MST neurons to spiral motions. The Journal of Neuroscience, 1994.

[14] R. Wurtz, et al. Medial superior temporal area neurons respond to speed patterns in optic flow. The Journal of Neuroscience, 1997.

[15] R. M. Siegel, et al. Analysis of optic flow in the monkey parietal area 7a. Cerebral Cortex, 1997.