Paper information - Foveated shot detection for video segmentation


We view scenes in the real world by moving our eyes three to four times each second and integrating information across successive fixations (foveation points). Taking advantage of this fact, in this paper we propose an original approach to partitioning a video into shots, based on a foveated representation of the video. More precisely, the shot-change detection method rests on computing, at each time instant, a consistency measure of the fixation sequences generated by an ideal observer looking at the video. The proposed scheme aims to detect both abrupt and gradual transitions between shots with a single technique, rather than with a set of dedicated methods. Results on videos of various content types are reported and validate the proposed approach.
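To make the idea of a per-instant consistency measure more concrete, the following is a minimal sketch of how such a signal might be turned into shot boundaries. The abstract does not specify the consistency measure or the decision rule, so everything here is an assumption for illustration: the function name `detect_transitions`, the fixed `threshold`, and the rule that a one-frame drop counts as an abrupt transition while a longer run counts as a gradual one. The consistency values are taken as given input and are not computed from fixation sequences.

```python
from typing import List, Tuple


def detect_transitions(
    consistency: List[float], threshold: float = 0.5
) -> List[Tuple[int, int, str]]:
    """Group consecutive low-consistency frames into candidate transitions.

    Hypothetical rule (not from the paper): a run of length 1 is labeled
    "abrupt", a longer run is labeled "gradual".
    """
    transitions = []
    start = None  # index where the current low-consistency run began
    for t, value in enumerate(consistency):
        if value < threshold:
            if start is None:
                start = t
        elif start is not None:
            end = t - 1
            kind = "abrupt" if end == start else "gradual"
            transitions.append((start, end, kind))
            start = None
    # Close a run that extends to the last frame.
    if start is not None:
        end = len(consistency) - 1
        kind = "abrupt" if end == start else "gradual"
        transitions.append((start, end, kind))
    return transitions


# Made-up signal: one isolated drop and one drop spanning several frames.
signal = [0.9, 0.9, 0.1, 0.9, 0.8, 0.4, 0.3, 0.4, 0.9]
print(detect_transitions(signal))
```

The point of the sketch is only that a single signal can, in principle, expose both kinds of transition, which is the property the abstract claims for the proposed scheme; the actual method may differ substantially.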
