Dominant and multiple motion estimation for video representation

The major inhibitors of rapid access to online video data are the cost and management of capture and storage, the lack of high-speed real-time delivery, and the unavailability of intelligent search and indexing techniques based on content and context. Solutions for capture, storage and delivery may be on the horizon; however, the lack of visual-content-based indexing of video and image information may still keep this information modality from being used as widely as text or tabular data is today. We present techniques for compact visual representation of video data that will be useful for visual-content-based presentation and indexing. Video data comes in torrents (almost a megabyte every 30th of a second) but also affords the exploitation of information that changes relatively smoothly over time. The techniques presented exploit the motion information across video frames to represent the underlying scene in a compact visual form, as it is seen across many slowly varying frames in a video. Two classes of techniques are presented: (i) dominant motion estimation based techniques, which exploit the fairly common occurrence in videos that a mostly fixed background (scene) is imaged with or without independently moving objects, and (ii) simultaneous multiple motion estimation and representation of motion video using layered representations.
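As a rough illustration of the first class of techniques, the sketch below estimates a single dominant motion between frames and uses it to recover a compact background image. It is not the model-based robust estimator of the paper: it assumes a pure integer translation with wrap-around at the image borders and substitutes phase correlation for the motion estimator, with a temporal median standing in for the compact scene representation. The function names `dominant_translation` and `background_from_frames` are hypothetical.

```python
import numpy as np


def dominant_translation(prev, curr):
    """Estimate the integer shift (dy, dx) with curr ~ np.roll(prev, (dy, dx), axis=(0, 1)).

    Assumption: the dominant motion is a pure translation and most pixels
    belong to the background, so the strongest phase-correlation peak
    corresponds to the background shift.
    """
    f_prev = np.fft.fft2(prev)
    f_curr = np.fft.fft2(curr)
    cross = f_curr * np.conj(f_prev)
    cross /= np.abs(cross) + 1e-12          # keep phase only (whitening)
    corr = np.fft.ifft2(cross).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Map peak indices from [0, size) to signed shifts.
    dy, dx = (p if p <= s // 2 else p - s for p, s in zip(peak, corr.shape))
    return int(dy), int(dx)


def background_from_frames(frames):
    """Align every frame to the first one and take a per-pixel temporal median.

    Pixels covered by an independently moving object in only a minority of
    the aligned frames are replaced by the background value.
    """
    ref = frames[0]
    aligned = [ref]
    for frame in frames[1:]:
        dy, dx = dominant_translation(ref, frame)
        aligned.append(np.roll(frame, (-dy, -dx), axis=(0, 1)))
    return np.median(np.stack(aligned), axis=0)
```

Given a list of equally sized grayscale frames as 2-D arrays, `background_from_frames(frames)` returns a single image of the dominant scene. A realistic system would need a richer motion model (affine or projective), subpixel accuracy, and proper handling of image borders.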
