To admit enough light to the image sensor, videos captured in poor lighting conditions typically have a low frame rate and a frame exposure time equal to the inter-frame period, commonly called full exposure time (FET). FET low-frame-rate videos are common in situations where lighting cannot be improved a priori for practical reasons (e.g., a large physical distance between the camera and the captured objects) or economic reasons (e.g., the long duration of nighttime surveillance). Previous computer vision work has shown that content at a desired higher frame rate can be recovered, to some degree of precision, from the captured FET video using self-similarity-based temporal super-resolution. For a network streaming scenario, where a client receives a FET video stream from a server and plays it back in real time, a practical question remains: what is the most suitable representation of the captured FET video at the encoder, given that a video at a higher frame rate must be constructed at the decoder at low complexity? In this paper, we present an adaptive frame and quantization parameter (QP) selection strategy in which, for a given target rate-distortion (RD) tradeoff, FET video frames at appropriate temporal resolutions and QPs are selected for encoding at the encoder using standard H.264 tools. At the decoder, temporal super-resolution is performed at low complexity on the decoded frames to synthesize the desired high-frame-rate video for real-time display. We formulate the selection of individual FET frames at different temporal resolutions and QPs as a shortest path problem that minimizes the Lagrangian cost of the encoded sequence. We then propose a computationally efficient algorithm, based on monotonicity in the predictor's temporal resolution and QP, to find the shortest path. Experiments show that our strategy outperforms alternative naïve non-adaptive approaches by up to 1.3 dB at the same bitrate.
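The shortest path formulation can be illustrated with a minimal sketch. It is not the algorithm of this paper: it is a plain exhaustive dynamic program over a trellis, without the monotonicity-based speedup, and every name in it (`select_frames`, `toy_cost`, the state layout) is hypothetical. It assumes that a frame at temporal level `l` covers `2**l` slots of the target high-frame-rate timeline, and that the Lagrangian cost `D + lam * R` of a frame depends only on its own level and QP and on those of its predictor. The distortion and rate model in `toy_cost` is invented purely for illustration.

```python
from itertools import product
from typing import Callable, Dict, List, Optional, Tuple

State = Tuple[int, int]  # (temporal level, QP); hypothetical state layout


def select_frames(
    num_slots: int,
    levels: Tuple[int, ...],
    qps: Tuple[int, ...],
    cost: Callable[[Optional[State], State, int], float],
) -> Tuple[float, List[Tuple[int, int, int]]]:
    """Return (total cost, [(start_slot, level, qp), ...]) of the cheapest path.

    best[t][state] holds the cheapest way to cover slots [0, t) such that the
    last encoded frame has the given (level, qp) state.
    """
    best: List[Dict[Optional[State], tuple]] = [dict() for _ in range(num_slots + 1)]
    best[0][None] = (0.0, None, None)  # empty prefix, no predictor yet

    for t in range(num_slots):
        for prev_state, (acc, _, _) in best[t].items():
            for level, qp in product(levels, qps):
                end = t + 2 ** level  # assumed span of a level-`level` frame
                if end > num_slots:
                    continue
                cand = acc + cost(prev_state, (level, qp), t)
                cur = best[end].get((level, qp))
                if cur is None or cand < cur[0]:
                    best[end][(level, qp)] = (cand, t, prev_state)

    # Pick the cheapest terminal state and backtrack to recover the path.
    state = min(best[num_slots], key=lambda s: best[num_slots][s][0])
    total = best[num_slots][state][0]
    path, t = [], num_slots
    while state is not None:
        _, prev_t, prev_state = best[t][state]
        path.append((prev_t, state[0], state[1]))
        t, state = prev_t, prev_state
    return total, path[::-1]


def toy_cost(prev: Optional[State], cur: State, t: int, lam: float) -> float:
    """Invented Lagrangian cost D + lam * R (illustrative model only)."""
    level, qp = cur
    span = 2 ** level
    # Assumed: distortion grows with QP and with the amount of super-resolution.
    distortion = span * (qp / 10.0 + 2.0 * level)
    rate = 50.0 / qp  # assumed: coarser quantization costs fewer bits per frame
    if prev is not None and prev[1] > qp:
        rate *= 1.2  # assumed penalty for predicting from a coarser-QP frame
    return distortion + lam * rate


if __name__ == "__main__":
    for lam in (0.0, 1000.0):
        total, path = select_frames(
            4, (0, 1, 2), (20, 30, 40), lambda p, c, t: toy_cost(p, c, t, lam)
        )
        print(lam, total, path)
```

Under this toy model, a small multiplier `lam` favors many finely quantized frames at the finest temporal resolution, while a large `lam` favors a few coarse frames whose missing content is left to temporal super-resolution at the decoder. The exhaustive search visits every (predictor, frame) pair at every slot, which is the cost that a monotonicity-based pruning rule would aim to reduce.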