This paper proposes a biologically inspired top-down learning model based on visual attention. Low-level visual features are extracted from the learning object itself and do not depend on background information. All features are collected into a feature vector, which is treated as a random variable following a normal distribution, so each learning object is represented by a mean and a standard deviation. All learning objects are then combined into an object class, represented by a class mean and class standard deviation stored in long-term memory (LTM). This learned knowledge is used to locate similar regions in an attended image. Experimental results indicate that when the attended object does not appear against a background similar to that of the learning objects, or when the object-background combinations change greatly between learning images and attended images, our model outperforms the top-down approach of VOCUS and Navalpakkam's statistical model.
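The representation described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the function names are hypothetical, and the Gaussian match score stands in for whatever similarity measure the paper actually uses.

```python
import numpy as np

def learn_object(feature_samples):
    # feature_samples: (n_samples, n_features) low-level feature vectors
    # extracted from one learning object; modeled as a normal distribution,
    # so the object is summarized by a mean and a standard deviation.
    return feature_samples.mean(axis=0), feature_samples.std(axis=0)

def learn_class(object_stats):
    # Combine per-object (mean, std) pairs into the class representation
    # stored in long-term memory (LTM): a class mean and class std.
    means = np.array([m for m, _ in object_stats])
    stds = np.array([s for _, s in object_stats])
    return means.mean(axis=0), stds.mean(axis=0)

def top_down_match(feature_map, class_mean, class_std, eps=1e-6):
    # Score each location of a (H, W, n_features) feature map by how well
    # its feature vector matches the learned class distribution
    # (a simple Gaussian match score; a stand-in for the paper's measure).
    z = (feature_map - class_mean) / (class_std + eps)
    return np.exp(-0.5 * (z ** 2).sum(axis=-1))
```

Because the features come from the object itself rather than its surroundings, the match score depends only on the class statistics, which is what makes the model robust to background changes between learning and attended images.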
[1] A. Hollingworth. Constructing visual representations of natural scenes: the roles of short- and long-term visual memory. Journal of Experimental Psychology: Human Perception and Performance, 2004.
[2] C. Koch et al. Feature combination strategies for saliency-based visual attention systems. Journal of Electronic Imaging, 2001.
[3] L. Itti et al. Modeling the influence of task on attention. Vision Research, 2005.
[4] S. Frintrop et al. VOCUS: A Visual Attention System for Object Detection and Goal-Directed Search. Lecture Notes in Computer Science, 2006.
[5] M. Pietikäinen et al. Texture analysis with local binary patterns. 2004.
[6] C. Koch et al. A model of saliency-based visual attention for rapid scene analysis. 2009.