Statistical templates for visual search.

How do we find a target embedded in a scene? Within the framework of signal detection theory, this task is carried out by comparing each region of the scene with a "template," i.e., an internal representation of the search target. Here we ask what form this representation takes when the search target is a complex image with uncertain orientation. We examine three possible representations. The first is the matched filter, a template of pixel intensities. Such a representation cannot account for the ease with which humans find a complex search target that is rotated relative to the template. The second representation attempts to deal with this by estimating the relative orientation of target and match and rotating the intensity-based template accordingly. No intensity-based template, however, can account for the ability to easily locate targets that are defined categorically rather than by a specific arrangement of pixels. We therefore define a third template that represents the target in terms of image statistics rather than pixel intensities. Subjects performed a two-alternative forced-choice search task in which they had to localize an image that matched a previously viewed target. Target images were texture patches. In the first condition, the match image was the same image as the target and the distractor was a different image of the same textured material. In the second condition, the match image was a different image of the same texture as the target and the distractor was an image of a different texture. Match and distractor stimuli were randomly rotated relative to the target. We compared human performance with that of three search models: pixel-based, pixel-based with rotation, and statistic-based. The statistic-based model was most successful at matching human performance. We conclude that humans use summary statistics to search for complex visual targets.
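
The three template types compared above can be illustrated with a minimal sketch. It is not the authors' implementation, and every function name is hypothetical. It assumes grayscale patches stored as 2-D NumPy arrays. The rotation model here simply tries the four 90-degree rotations instead of estimating orientation, and the statistic model uses a handful of intensity and gradient moments as a stand-in for whatever image statistics the study actually used.

```python
import numpy as np

def pixel_score(template, patch):
    """Matched-filter score: normalized correlation of pixel intensities."""
    t = template - template.mean()
    p = patch - patch.mean()
    return float((t * p).sum() / (np.linalg.norm(t) * np.linalg.norm(p) + 1e-12))

def rotated_pixel_score(template, patch):
    """Pixel template with rotation: best correlation over four 90-degree
    rotations (an assumed simplification of orientation estimation)."""
    return max(pixel_score(np.rot90(template, k), patch) for k in range(4))

def summary_stats(img):
    """Assumed stand-in statistics: moments of intensity and of gradient magnitude."""
    gy, gx = np.gradient(img)
    grad = np.hypot(gx, gy)
    return np.array([img.mean(), img.std(), grad.mean(), grad.std()])

def statistic_score(template, patch):
    """Statistic template: negative distance between summary-statistic vectors,
    so that a larger score means a better match."""
    return -float(np.linalg.norm(summary_stats(template) - summary_stats(patch)))

def choose_2afc(score, template, first, second):
    """Two-alternative forced choice: return 0 or 1, the index of the
    candidate with the higher match score."""
    return 0 if score(template, first) >= score(template, second) else 1
```

Calling `choose_2afc(statistic_score, target, candidate_a, candidate_b)` returns the index of the candidate the statistic-based observer would pick. Because these statistics discard pixel positions, the score is unchanged by 90-degree rotations of a patch and can match a different sample of the same texture, which is the property the third template is meant to capture.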
