Analytical distortion aware video coding for computer based video analysis

With the development of artificial intelligence, more and more multimedia applications for various tasks have emerged in daily life. Meanwhile, a huge amount of video data, one of the main information sources for these applications, is generated every day by portable and mounted cameras for purposes such as surveillance, where computers may be needed to "watch" the videos to save labor costs. However, most video coding standards are designed to maximize human perceptual quality at a given bit rate by minimizing a fidelity cost function (e.g., mean squared error, MSE), on the assumption that the content will be consumed by human viewers. In view of these considerations, this paper proposes a new rate-analytical-distortion optimization (RADO) method for video analysis. Specifically, we consider moving object detection as the analysis task. Accordingly, we develop a novel rate-analytical-distortion (RAD) model for video coding, in which the analytical distortion is tied to object detection performance expressed in terms of the F-measure. Experimental results show that the performance of the video analysis task can be significantly improved (up to a 40% reduction in analytical distortion) at the cost of a slight bit rate increase.
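As a rough illustration of the quantities named above, the following Python sketch computes the F-measure of a binary moving-object mask against a ground-truth mask and uses it in a Lagrangian mode decision. It is not the RAD model of the paper: defining the analytical distortion as 1 - F, the cost J = D + lambda * R, and the names `f_measure`, `analytical_distortion`, and `select_candidate` are all assumptions made here for illustration.

```python
import numpy as np


def f_measure(detected, ground_truth):
    """F-measure (harmonic mean of precision and recall) of a binary mask."""
    detected = np.asarray(detected, dtype=bool)
    ground_truth = np.asarray(ground_truth, dtype=bool)
    tp = np.count_nonzero(detected & ground_truth)    # true positives
    fp = np.count_nonzero(detected & ~ground_truth)   # false positives
    fn = np.count_nonzero(~detected & ground_truth)   # false negatives
    if tp == 0:
        return 0.0  # no correct detections: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def analytical_distortion(detected, ground_truth):
    """Assumed definition for this sketch: distortion = 1 - F-measure."""
    return 1.0 - f_measure(detected, ground_truth)


def select_candidate(candidates, lam):
    """Pick the (label, rate, distortion) tuple minimizing D + lam * R."""
    return min(candidates, key=lambda c: c[2] + lam * c[1])


if __name__ == "__main__":
    gt = [[1, 1], [0, 0]]
    det = [[1, 0], [1, 0]]
    print(analytical_distortion(det, gt))
    # Hypothetical coding candidates: (label, rate in bits, analytical distortion)
    options = [("coarse", 100, 0.5), ("fine", 200, 0.3)]
    print(select_candidate(options, lam=0.001)[0])
```

With a small multiplier the decision favors the candidate with lower analytical distortion, while a larger multiplier favors the cheaper one, which is the usual trade-off in rate-distortion optimization.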
