Spatio-temporal consistency in video disparity estimation

We present a novel method for disparity estimation in stereo video. The method is a two-stage algorithm. In the first stage, initial disparity maps are computed on a frame-by-frame basis. In the second stage, the initial estimates are treated as a space-time volume. By setting up an l1-norm minimization problem with a novel three-dimensional total variation regularization, spatial smoothness and temporal consistency are handled simultaneously. Because the second stage operates only on the initial disparity maps, any existing image disparity estimation technique can use our method as a post-processing step, either to refine noisy estimates or to extend the technique to video. The proposed method shows superior speed, accuracy, and consistency compared to state-of-the-art algorithms.
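
To make the second stage concrete, the following is a minimal sketch of the kind of objective described above: an l1 data term that ties the refined volume to the initial estimates, plus a three-dimensional total variation term over the space-time volume. The abstract does not give the exact formulation, so the anisotropic form of the regularizer, the function name `tv3d_l1_objective`, and the parameters `mu` and `beta` are all assumptions made for illustration. The sketch only evaluates the objective; it does not implement a solver.

```python
import numpy as np


def tv3d_l1_objective(f, g, mu=1.0, beta=(1.0, 1.0, 1.0)):
    """Evaluate mu * ||f - g||_1 + anisotropic 3D total variation of f.

    f, g : disparity volumes of shape (T, H, W); g holds the initial
           frame-by-frame estimates, f is a candidate refined volume.
    mu   : weight of the l1 data term (hypothetical parameter).
    beta : weights of the temporal, vertical, and horizontal differences
           (hypothetical parameters).
    """
    f = np.asarray(f, dtype=float)
    g = np.asarray(g, dtype=float)
    data_term = np.abs(f - g).sum()
    bt, by, bx = beta
    # Forward differences along time (axis 0), rows (axis 1), columns (axis 2).
    tv_term = (bt * np.abs(np.diff(f, axis=0)).sum()
               + by * np.abs(np.diff(f, axis=1)).sum()
               + bx * np.abs(np.diff(f, axis=2)).sum())
    return mu * data_term + tv_term


# A temporally consistent volume: the same horizontal disparity ramp in every frame.
consistent = np.tile(np.linspace(0.0, 10.0, 8), (4, 6, 1))  # shape (4, 6, 8)

# The same volume with frame-to-frame flicker, standing in for noisy
# per-frame estimates: every second frame is offset by one disparity level.
flicker = consistent.copy()
flicker[1::2] += 1.0

# Treat the flickering volume as the initial estimate g and compare two
# candidates for the refined volume f.
cost_keep_flicker = tv3d_l1_objective(flicker, flicker)
cost_consistent = tv3d_l1_objective(consistent, flicker)
print(cost_keep_flicker, cost_consistent)
```

With these default weights, the temporally consistent candidate has the lower objective value even though it departs from the initial estimate, because the temporal difference term penalizes the flicker more than the data term penalizes the correction. Changing `mu` or `beta` shifts that balance.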
