Robust Region-of-Interest Determination Based on User Attention Model Through Visual Rhythm Analysis

This paper investigates a user attention model based on visual rhythm analysis for automatically determining the region-of-interest (ROI) in a video. The visual rhythm is an abstraction of a video: a thumbnail of the full video in the form of a single 2D image that captures the temporal information of the sequence. Four sampling lines (diagonal, anti-diagonal, vertical, and horizontal) are used to obtain four visual rhythm maps, from which the location of the ROI is analyzed. Variations in the visual rhythms allow object motion and camera motion to be distinguished efficiently. The proposed scheme extracts the ROI accurately with very low computational complexity. Experimental results show that the moving object is extracted effectively and efficiently.
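To make the construction concrete, the following is a minimal sketch of how the four visual rhythm maps could be built from a grayscale video. It is an illustration under stated assumptions, not the authors' implementation: the names `sampling_indices` and `visual_rhythm` are hypothetical, the horizontal and vertical lines are assumed to pass through the frame center, and the diagonals are assumed to run corner to corner. Each frame contributes one column of sampled pixels, so time runs along the horizontal axis of the resulting image.

```python
import numpy as np

def sampling_indices(height, width, line):
    """Return (rows, cols) pixel coordinates along one sampling line.

    Assumption: horizontal/vertical lines pass through the frame center,
    and the diagonals run corner to corner.
    """
    if line == "horizontal":          # middle row, left to right
        cols = np.arange(width)
        rows = np.full(width, height // 2)
    elif line == "vertical":          # middle column, top to bottom
        rows = np.arange(height)
        cols = np.full(height, width // 2)
    elif line in ("diagonal", "anti-diagonal"):
        n = max(height, width)        # one sample per step along the longer side
        rows = np.round(np.linspace(0, height - 1, n)).astype(int)
        cols = np.round(np.linspace(0, width - 1, n)).astype(int)
        if line == "anti-diagonal":   # top-right to bottom-left
            cols = cols[::-1]
    else:
        raise ValueError(f"unknown sampling line: {line!r}")
    return rows, cols

def visual_rhythm(frames, line):
    """Build one visual rhythm map from frames of shape (T, H, W).

    Returns an array of shape (n_samples, T): column t holds the pixels
    sampled along `line` in frame t.
    """
    frames = np.asarray(frames)
    _, height, width = frames.shape
    rows, cols = sampling_indices(height, width, line)
    # Indexing yields shape (T, n_samples); transpose so time is the x-axis.
    return frames[:, rows, cols].T

if __name__ == "__main__":
    # Synthetic 5-frame, 4x6 video; pixel value encodes (t, row, col).
    t, r, c = np.meshgrid(np.arange(5), np.arange(4), np.arange(6), indexing="ij")
    video = t * 100 + r * 10 + c
    for name in ("horizontal", "vertical", "diagonal", "anti-diagonal"):
        print(name, visual_rhythm(video, name).shape)
```

A static scene produces horizontal streaks in such a map, while motion crossing a sampling line bends or breaks those streaks. How the four maps are then combined to separate object motion from camera motion is not specified in the abstract, so it is left out of this sketch.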