Fast Approximate Energy Minimization via Graph Cuts

Many tasks in computer vision involve assigning a label (such as disparity) to every pixel. A common constraint is that the labels should vary smoothly almost everywhere while preserving sharp discontinuities that may exist, e.g., at object boundaries. These tasks are naturally stated in terms of energy minimization. We consider a wide class of energies with various smoothness constraints. Global minimization of these energy functions is NP-hard even in the simplest discontinuity-preserving case. Therefore, our focus is on efficient approximation algorithms. We present two algorithms based on graph cuts that efficiently find a local minimum with respect to two types of large moves, namely expansion moves and swap moves. These moves can simultaneously change the labels of arbitrarily large sets of pixels. In contrast, many standard algorithms (including simulated annealing) use small moves where only one pixel changes its label at a time. Our expansion algorithm finds a labeling within a known factor of the global minimum, while our swap algorithm handles more general energy functions. Both algorithms allow important cases of discontinuity-preserving energies. We experimentally demonstrate the effectiveness of our approach for image restoration, stereo, and motion. On real data with ground truth, we achieve 98 percent accuracy.
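To make the notion of a large move concrete, the following is a minimal sketch of a local search over expansion moves on a toy problem. It rests on several assumptions that are not stated in the abstract: the pixels form a 1-D chain, the energy is a squared-difference data term plus a Potts penalty `lam` for each pair of neighbors with different labels, and the best move for each label is found by brute-force enumeration instead of a graph cut. The function names are hypothetical, and the enumeration is only feasible for a handful of pixels; it stands in for the min-cut step, which is what makes the real algorithms efficient.

```python
from itertools import product


def energy(labels, observed, lam=1.0):
    """Assumed toy energy: squared-difference data term plus a Potts penalty."""
    data = sum((f - o) ** 2 for f, o in zip(labels, observed))
    smooth = sum(lam for a, b in zip(labels, labels[1:]) if a != b)
    return data + smooth


def best_expansion_move(labels, alpha, observed, lam=1.0):
    """Return the lowest-energy labeling within one alpha-expansion of `labels`.

    In an expansion move every pixel either keeps its current label or
    switches to alpha, so any set of pixels may change at once.
    Brute force over all 2^n choices replaces the graph cut here.
    """
    best = list(labels)
    best_e = energy(best, observed, lam)
    for choice in product((False, True), repeat=len(labels)):
        cand = [alpha if take else f for take, f in zip(choice, labels)]
        e = energy(cand, observed, lam)
        if e < best_e:
            best, best_e = cand, e
    return best


def expansion_local_minimum(observed, label_set, lam=1.0):
    """Cycle over the labels until no expansion move lowers the energy."""
    labels = [label_set[0]] * len(observed)
    improved = True
    while improved:
        improved = False
        for alpha in label_set:
            cand = best_expansion_move(labels, alpha, observed, lam)
            if energy(cand, observed, lam) < energy(labels, observed, lam):
                labels, improved = cand, True
    return labels


if __name__ == "__main__":
    # A noisy step signal: one outlier in the left half.
    observed = [0, 0, 1, 0, 3, 3, 3, 3]
    result = expansion_local_minimum(observed, (0, 1, 2, 3), lam=2)
    print(result, energy(result, observed, lam=2))
```

A swap move could be sketched the same way by restricting attention to pixels currently labeled alpha or beta and letting each of them take either of the two labels. A single-pixel method such as simulated annealing would correspond to allowing only one `True` entry in `choice`.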
