Foveated Image and Video Processing and Search

Abstract. In this article, we present an overview of current techniques for foveating images and video. We briefly introduce the relevant aspects of the human visual system and explain how they motivate foveated, or variable-resolution, images. We explore foveation as a perceptually lossless compression scheme and show how it can be integrated into modern video coding algorithms. We also explore foveation as an efficient processing scheme for settings where resources such as bandwidth and computation are constrained. We discuss algorithms for performing visual tasks, such as search and detection, with foveated imaging systems, and we examine human behavior on similar tasks to gain insight into how best to design these algorithms. We look at some of the more promising applications of foveated video, such as teleoperation, as well as open issues and problems, including fixation selection. We conclude with what we believe are the future trends in this area of research.
