Application-aware video coding architecture using camera and object motion-models

The proliferation of video consumption, especially on mobile devices, has created a demand for efficient interactive video applications and high-level video analysis. This demand is particularly significant in real-time applications and resource-limited scenarios. Pixel-domain video processing is too complex to be efficient for many of these applications, whereas compressed-domain processing offers fast but unreliable results. To achieve fast and effective video processing, this paper proposes a novel video encoding architecture that facilitates efficient compressed-domain processing while maintaining compliance with mainstream coding standards. This is achieved by optimizing the accuracy of the motion information embedded in the compressed video, in addition to compression efficiency. In a motion detection application, we demonstrate that the motion estimated by the proposed encoder can be used directly to extract object information, which is not the case for conventionally coded video. The incurred rate-distortion overhead can be weighed against the reduced processing required for video analysis across a wide spectrum of computer vision applications.
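To illustrate what using the encoded motion directly could look like, the following is a minimal sketch of block-level motion detection on a decoded motion-vector field. It is not the method of the paper: the function name `detect_moving_blocks`, the `threshold` parameter, and the assumption that camera motion is a pure translation estimated by the component-wise median are all hypothetical choices made for this example.

```python
from math import hypot
from statistics import median


def detect_moving_blocks(motion_vectors, threshold=1.0):
    """Flag blocks whose motion deviates from the dominant (camera) motion.

    motion_vectors: 2D list of (dx, dy) tuples, one per block.
    Returns a 2D list of booleans with the same shape.
    """
    flat = [mv for row in motion_vectors for mv in row]
    # Assumption: camera motion is a pure translation, estimated here
    # as the component-wise median of all block motion vectors.
    cam_dx = median(dx for dx, _ in flat)
    cam_dy = median(dy for _, dy in flat)
    # A block is treated as part of a moving object when its residual
    # motion (after removing the camera estimate) exceeds the threshold.
    return [
        [hypot(dx - cam_dx, dy - cam_dy) > threshold for dx, dy in row]
        for row in motion_vectors
    ]


# Usage: a 4x4 field panning right by 2 pixels, with two blocks moving differently.
field = [[(2, 0)] * 4 for _ in range(4)]
field[1][2] = (5, 3)
field[2][2] = (5, 3)
mask = detect_moving_blocks(field)
```

A sketch like this only gives meaningful object masks when the motion vectors reflect true motion rather than the cheapest-to-code match, which is the property the proposed encoder aims to provide.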
