Learning Low-level Vision

We describe a learning-based method for low-level vision problems: estimating scenes from images. We generate a synthetic world of scenes and their corresponding rendered images, modeling their relationships with a Markov network. Bayesian belief propagation allows us to efficiently find a local maximum of the posterior probability for the scene, given an image. We call this approach VISTA: Vision by Image/Scene TrAining. We apply VISTA to the "super-resolution" problem (estimating high frequency details from a low-resolution image), showing good results. To illustrate the potential breadth of the technique, we also apply it in two other problem domains, both simplified. We learn to distinguish shading from reflectance variations in a single image under particular lighting conditions. For the motion estimation problem in a "blobs world", we show figure/ground discrimination, solution of the aperture problem, and filling-in arising from application of the same probabilistic machinery.

To appear in: International Journal of Computer Vision, 2000.

This work may not be copied or reproduced in whole or in part for any commercial purpose. Permission to copy in whole or in part without payment of fee is granted for nonprofit educational and research purposes provided that all such whole or partial copies include the following: a notice that such copying is by permission of Mitsubishi Electric Information Technology Center America; an acknowledgment of the authors and individual contributions to the work; and all applicable portions of the copyright notice. Copying, reproduction, or republishing for any other purpose shall require a license with payment of fee to Mitsubishi Electric Information Technology Center America. All rights reserved.

Copyright © Mitsubishi Electric Information Technology Center America, 2000
201 Broadway, Cambridge, Massachusetts 02139

1. First printing, TR2000-05, March, 2000.
1. revision, TR2000-05a, July, 2000.
William T. Freeman, Egon C. Pasztor, Owen T. Carmichael
MERL, Mitsubishi Electric Research Labs.

Egon Pasztor's present address: MIT Media Lab, 20 Ames St., Cambridge, MA 02139.
Owen Carmichael's present address: Carnegie Mellon University, Robotics Institute, 5000 Forbes Avenue, Pittsburgh, PA 15213.
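
The inference step named in the abstract, belief propagation over a Markov network of scene nodes with image observations, can be illustrated on a toy problem. The following is a minimal sketch under stated assumptions, not the authors' implementation: it uses a chain-shaped network with no loops, where max-product propagation recovers the exact posterior maximum, whereas the abstract describes finding a local maximum in the general case. The function names, the two-state scene variables, and all numeric values are hypothetical and chosen only for illustration.

```python
from itertools import product


def max_product_chain(phi, psi):
    """Max-product belief propagation on a chain Markov network.

    phi[i][s]: local evidence for scene state s at node i (from the image).
    psi[a][b]: compatibility between neighboring scene states a and b.
    Returns the most probable state index at each node.
    """
    n, k = len(phi), len(psi)

    # fwd[i][b]: message arriving at node i from its left neighbor.
    fwd = [[1.0] * k for _ in range(n)]
    for i in range(1, n):
        for b in range(k):
            fwd[i][b] = max(
                phi[i - 1][a] * fwd[i - 1][a] * psi[a][b] for a in range(k)
            )

    # bwd[i][a]: message arriving at node i from its right neighbor.
    bwd = [[1.0] * k for _ in range(n)]
    for i in range(n - 2, -1, -1):
        for a in range(k):
            bwd[i][a] = max(
                phi[i + 1][b] * bwd[i + 1][b] * psi[a][b] for b in range(k)
            )

    # Belief at each node: local evidence times both incoming messages.
    estimate = []
    for i in range(n):
        belief = [phi[i][s] * fwd[i][s] * bwd[i][s] for s in range(k)]
        estimate.append(max(range(k), key=belief.__getitem__))
    return estimate


def brute_force_map(phi, psi):
    """Exhaustive search over all joint states, used only as a check."""
    n, k = len(phi), len(psi)

    def score(x):
        s = 1.0
        for i in range(n):
            s *= phi[i][x[i]]
        for i in range(n - 1):
            s *= psi[x[i]][x[i + 1]]
        return s

    return list(max(product(range(k), repeat=n), key=score))


# Hypothetical toy data: four scene nodes, two candidate states each.
phi = [[0.9, 0.1], [0.4, 0.6], [0.8, 0.2], [0.1, 0.9]]
# Hypothetical smoothness prior: neighboring nodes prefer equal states.
psi = [[0.8, 0.2], [0.2, 0.8]]

print(max_product_chain(phi, psi))
print(brute_force_map(phi, psi))
```

In this toy case the compatibility term overrides the weak local evidence at the second node, which is the kind of neighbor-driven disambiguation the Markov network is meant to provide. On a network with loops the same message updates would be iterated and the result is not guaranteed to be the global maximum.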
