Detecting intended human objects in human-captured videos

When people capture videos, they want to record their intended objects, the objects essential to what they want to express, and to share those objects with others. The concept of intended objects provides a novel perspective for video content analysis, and detecting intended objects can benefit a wide range of applications, such as video understanding and semantic interpretation, video summarization, video adaptation, and video privacy protection. In this paper, we focus on a particular type of intended object, namely intended human objects, and develop a method for detecting intended human objects automatically in human-captured videos. We also investigate the correlation between intended human objects and visual attention. Our experimental results indicate that the proposed method successfully detects intended human objects.
