A computation approach to the minimum total rate problem of causal video coding

Causal video coding is considered from an information-theoretic point of view. Video source frames X<inf>1</inf>, X<inf>2</inf>, …, X<inf>N</inf> are encoded frame by frame: the encoder for each frame X<inf>k</inf>, k = 1, …, N, can use all previous frames and all previous encoded frames, while the corresponding decoder can use only the previous encoded frames, and each frame X<inf>k</inf> is itself modeled as a source X<inf>k</inf> = {X<inf>k</inf>(i)}<inf>i=1</inf><sup>∞</sup>. A novel computation approach is proposed to analytically characterize and numerically compute the minimum total rate R<inf>c</inf>(D<inf>1</inf>, …, D<inf>N</inf>) required to achieve a given distortion (quality) level D<inf>1</inf>, …, D<inf>N</inf> ≥ 0. Specifically, we first show that for jointly stationary ergodic sources X<inf>1</inf>, X<inf>2</inf>, …, X<inf>N</inf>, R<inf>c</inf>(D<inf>1</inf>, …, D<inf>N</inf>) is equal to the infimum over all n of the nth-order total rate distortion function R<inf>c,n</inf>(D<inf>1</inf>, …, D<inf>N</inf>), where R<inf>c,n</inf>(D<inf>1</inf>, …, D<inf>N</inf>) itself is given by the minimum of an information quantity over a set of auxiliary random variables. We then present an iterative algorithm for computing R<inf>c,n</inf>(D<inf>1</inf>, …, D<inf>N</inf>) and demonstrate that the algorithm converges to the global minimum. This global convergence further enables us to establish, in a novel way, a single-letter characterization of R<inf>c</inf>(D<inf>1</inf>, …, D<inf>N</inf>) when the N sources are an independent and identically distributed vector source.
The algorithm also yields insight into how each frame should be encoded in order to achieve R<inf>c</inf>(D<inf>1</inf>, …, D<inf>N</inf>); it is demonstrated by example that R<inf>c</inf>(D<inf>1</inf>, …, D<inf>N</inf>) is in general much smaller than the total rate offered by the traditional greedy coding method, in which each frame is encoded in a locally optimal manner based on all information available to the encoder of that frame. In addition, a tight achievable rate distortion region is derived.
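
To give a feel for what an iterative rate distortion computation looks like, the following is a minimal sketch of the classical Blahut-Arimoto alternating minimization for a single memoryless source. It is not the algorithm proposed in this paper for R<inf>c,n</inf>(D<inf>1</inf>, …, D<inf>N</inf>), which involves N frames and auxiliary random variables; the function name `blahut_arimoto_rd`, the slope parameter `beta`, and the stopping rule are illustrative assumptions.

```python
import numpy as np


def blahut_arimoto_rd(p_x, dist, beta, n_iter=500, tol=1e-12):
    """Classical single-source Blahut-Arimoto iteration (illustrative sketch).

    p_x  : source distribution, shape (|X|,)
    dist : distortion matrix d(x, y), shape (|X|, |Y|)
    beta : positive slope parameter selecting one point on the R(D) curve
    Returns (rate_in_bits, expected_distortion).
    """
    p_x = np.asarray(p_x, dtype=float)
    dist = np.asarray(dist, dtype=float)
    kernel = np.exp(-beta * dist)                 # exp(-beta * d(x, y))
    q_y = np.full(dist.shape[1], 1.0 / dist.shape[1])  # uniform start

    for _ in range(n_iter):
        # Step 1: best test channel w(y|x) for the current output marginal.
        w = q_y * kernel
        w /= w.sum(axis=1, keepdims=True)
        # Step 2: output marginal induced by that test channel.
        q_new = p_x @ w
        converged = np.max(np.abs(q_new - q_y)) < tol
        q_y = q_new
        if converged:
            break

    # Evaluate the final test channel and its own induced marginal.
    w = q_y * kernel
    w /= w.sum(axis=1, keepdims=True)
    q_out = p_x @ w
    joint = p_x[:, None] * w
    distortion = float((joint * dist).sum())
    mask = joint > 0                              # skip zero-probability terms
    ratio = w[mask] / np.broadcast_to(q_out, w.shape)[mask]
    rate = float((joint[mask] * np.log2(ratio)).sum())  # I(X;Y) in bits
    return rate, distortion
```

For a uniform binary source with Hamming distortion, calling `blahut_arimoto_rd([0.5, 0.5], [[0, 1], [1, 0]], beta=np.log(9))` should return a distortion of 0.1 and a rate of 1 − h(0.1) ≈ 0.531 bits, where h is the binary entropy function.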
