Perception of differences in naturalistic dynamic scenes, and a V1-based model.

We investigate whether a computational model of V1 can predict how observers rate perceptual differences between paired movie clips of natural scenes. Observers viewed 198 pairs of movie clips, rating how different the two clips appeared to them on a magnitude scale. Sixty-six of the movie pairs were naturalistic, and the remainder were low-pass or high-pass spatially filtered versions of those originals. We examined three ways of comparing a movie pair. The Spatial Model compared corresponding frames of the two movies pairwise, combining those differences using Minkowski summation. The Temporal Model compared successive frames within each movie, summed those differences for each movie, and then compared the overall differences between the paired movies. The Ordered-Temporal Model combined elements from both models, and yielded the single strongest predictions of observers' ratings. We modeled naturalistic sustained and transient impulse functions, and also compared frames directly with no temporal filtering. Overall, modeling naturalistic temporal filtering improved the models' performance; in particular, the predictions of the ratings for low-pass spatially filtered movies were much improved by employing a transient impulse function. The correlations between model predictions and observers' ratings rose from 0.507 without temporal filtering to 0.759 (p = 0.01%) when realistic impulses were included. The sustained impulse function and the Spatial Model carried more weight in ratings for normal and high-pass movies, whereas the transient impulse function with the Ordered-Temporal Model was most important for spatially low-pass movies. This is consistent with models in which high spatial frequency channels with sustained responses primarily code for spatial details in movies, while low spatial frequency channels with transient responses code for dynamic events.
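The difference between the Spatial and Temporal comparisons can be illustrated with a minimal sketch. This is not the authors' V1 model: raw pixel differences stand in for the modeled neuronal responses, no temporal impulse filtering is applied, and the Minkowski exponent `m`, the function names, and the use of Minkowski summation inside the Temporal comparison are all assumptions made for illustration. The Ordered-Temporal Model is not sketched because the abstract does not specify how it combines the two.

```python
import numpy as np

def minkowski_sum(values, m=4.0):
    """Minkowski summation: (sum |v|^m)^(1/m). The exponent m is an assumed value."""
    values = np.abs(np.asarray(values, dtype=float))
    return float(np.sum(values ** m) ** (1.0 / m))

def frame_difference(frame_a, frame_b, m=4.0):
    """Illustrative stand-in for a V1-model comparison of two frames:
    pools raw pixel differences instead of modeled neuronal responses."""
    return minkowski_sum(frame_a - frame_b, m)

def spatial_comparison(movie_a, movie_b, m=4.0):
    """Compare corresponding frames across the two movies, then pool over time."""
    diffs = [frame_difference(fa, fb, m) for fa, fb in zip(movie_a, movie_b)]
    return minkowski_sum(diffs, m)

def temporal_comparison(movie_a, movie_b, m=4.0):
    """Compare successive frames within each movie, pool per movie,
    then compare the two pooled values."""
    def within_movie(movie):
        diffs = [frame_difference(movie[t], movie[t + 1], m)
                 for t in range(len(movie) - 1)]
        return minkowski_sum(diffs, m)
    return abs(within_movie(movie_a) - within_movie(movie_b))

# Hypothetical data: two static movies (frames x height x width) with different content.
movie_a = np.zeros((3, 2, 2))
movie_b = np.ones((3, 2, 2))
spatial = spatial_comparison(movie_a, movie_b)    # nonzero: frames differ in content
temporal = temporal_comparison(movie_a, movie_b)  # zero: neither movie changes over time
```

In this toy case the two movies differ in spatial content but neither contains any change over time, so only the spatial comparison registers a difference.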
