On Learning Conditional Random Fields for Stereo

Until recently, the lack of ground-truth data has hindered the application of discriminative structured prediction techniques to the stereo problem. In this paper we use ground-truth data sets that we have recently constructed to explore different model structures and parameter learning techniques. Estimating the parameters of Markov random fields (MRFs) by maximum likelihood usually requires approximate probabilistic inference. Conditional random fields (CRFs) are discriminative versions of traditional MRFs. We explore a number of novel CRF model structures, including a CRF for stereo matching with an explicit occlusion model. Learning a CRF requires an expensive inference step at each iteration of optimization, and inference is particularly slow when there are many discrete states. We explore belief propagation, variational message passing, and graph cuts as inference methods during learning, and compare them with learning via pseudolikelihood. To accelerate approximate inference we have developed a new method, called sparse variational message passing, which can reduce inference time by an order of magnitude with negligible loss in quality. Learning with sparse variational message passing improves upon previous approaches that use graph cuts, and allows efficient learning over large data sets when the energy functions violate the constraints imposed by graph cuts.
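The abstract does not say how sparsity is imposed on the messages, so the following is only an illustrative sketch of one plausible pruning step, not the authors' algorithm. It assumes that a per-pixel variational belief over disparity states is sparsified by keeping the fewest states whose total mass reaches 1 - eps and renormalizing; the function name `sparsify`, the threshold `eps`, and the example beliefs are all hypothetical.

```python
def sparsify(q, eps=0.01):
    """Keep the fewest states whose total mass is at least 1 - eps.

    q   : list of non-negative probabilities summing to 1 (assumed input)
    eps : probability mass we allow ourselves to discard (assumed parameter)
    Returns a list of the same length with pruned states set to 0.0
    and the retained states renormalized to sum to 1.
    """
    # Visit states from most to least probable.
    order = sorted(range(len(q)), key=lambda k: q[k], reverse=True)
    kept, mass = [], 0.0
    for k in order:
        kept.append(k)
        mass += q[k]
        if mass >= 1.0 - eps:
            break
    # Zero out pruned states and renormalize the retained ones.
    sparse = [0.0] * len(q)
    for k in kept:
        sparse[k] = q[k] / mass
    return sparse


# Hypothetical belief over five disparity states at one pixel.
beliefs = [0.70, 0.25, 0.03, 0.015, 0.005]
sparse_beliefs = sparsify(beliefs, eps=0.1)
active_states = [k for k, p in enumerate(sparse_beliefs) if p > 0.0]
```

Under this assumption, later message updates would only need to sum over `active_states` rather than all disparity levels, which is where a speed-up would come from when most of the belief mass sits on a few states.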
