Efficient H.264 video coding with a working memory of objects

In this work, we investigate a working memory approach for efficient temporal prediction in H.264 video coding. After video frames are encoded, objects are extracted, analyzed, and indexed in a dynamic database that acts as a working memory for the H.264 video encoder. During encoding, objects with similar spatial characteristics are retrieved from the working memory and used for motion prediction of objects in the current video frame. This approach extends multiple-frame estimation and provides a more generic framework for spatiotemporal prediction of video data. Our experimental results on surveillance video data demonstrate that the proposed approach reduces the coding bit rate by up to 35% with a small computational overhead.
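
The index-then-retrieve loop described above can be illustrated with a minimal sketch. It is not the encoder integration evaluated in this work: the class and function names (`WorkingMemory`, `StoredObject`, `intensity_histogram`), the use of a coarse intensity histogram as the spatial descriptor, the L1 distance, and the oldest-first eviction policy are all assumptions chosen for illustration.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List, Optional, Tuple

Patch = List[List[int]]  # 8-bit luma samples of one extracted object (assumed layout)


def intensity_histogram(pixels: Patch, bins: int = 4) -> Tuple[float, ...]:
    """Hypothetical spatial descriptor: a normalized histogram of 8-bit intensities."""
    counts = [0] * bins
    total = 0
    for row in pixels:
        for value in row:
            counts[min(value * bins // 256, bins - 1)] += 1
            total += 1
    if total == 0:
        return tuple(0.0 for _ in counts)
    return tuple(c / total for c in counts)


@dataclass
class StoredObject:
    frame_index: int             # frame the object was extracted from
    feature: Tuple[float, ...]   # descriptor used as the retrieval key
    pixels: Patch                # reconstructed patch, usable as a prediction reference


class WorkingMemory:
    """Bounded store of previously coded objects (assumed oldest-first eviction)."""

    def __init__(self, capacity: int = 64) -> None:
        self._items: Deque[StoredObject] = deque(maxlen=capacity)

    def __len__(self) -> int:
        return len(self._items)

    def index(self, frame_index: int, pixels: Patch) -> None:
        """Analyze an object from an already encoded frame and add it to the memory."""
        self._items.append(
            StoredObject(frame_index, intensity_histogram(pixels), pixels)
        )

    def retrieve(self, pixels: Patch, max_distance: float) -> Optional[StoredObject]:
        """Return the stored object closest to the query, or None if none is similar."""
        query = intensity_histogram(pixels)
        best: Optional[StoredObject] = None
        best_distance = max_distance
        for item in self._items:
            # L1 distance between descriptors (an assumed similarity measure).
            distance = sum(abs(a - b) for a, b in zip(item.feature, query))
            if distance <= best_distance:
                best, best_distance = item, distance
        return best


memory = WorkingMemory(capacity=2)
memory.index(0, [[10, 20], [30, 40]])          # dark object from frame 0
memory.index(1, [[200, 210], [220, 230]])      # bright object from frame 1

# An object in the current frame retrieves the stored object it resembles.
match = memory.retrieve([[12, 18], [35, 44]], max_distance=0.5)
no_match = memory.retrieve([[100, 100], [100, 100]], max_distance=0.5)
```

In a real encoder, the retrieved patch would serve as an additional reference candidate for motion search, and a `None` result would fall back to the usual reference frames. The bounded `deque` keeps the lookup cost proportional to the memory capacity rather than to the length of the video.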
