Recovering Relative Depth from Low-Level Features Without Explicit T-junction Detection and Interpretation

This work presents a novel computational model for estimating relative depth order from a single image. The model relies on low-level local features that quantitatively encode perceptual depth cues such as convexity/concavity, inclusion, and T-junctions, taking information at different scales into account. These multi-scale features are based on a measure of how likely a pixel is to belong simultaneously to different objects (interpreted as connected components of level sets) and, hence, to be occluded in some of them, which provides a hint about the local depth order relationships. The features are computed directly and efficiently on the discrete image data, without requiring the detection and interpretation of edges or junctions. Their behavior is clarified and illustrated on some simple images. The relative depth order of the image is then recovered by globally integrating these local features with a non-linear diffusion filter of bilateral type. The validity of the proposed features and of the integration approach is demonstrated by experiments on real images and by comparison with state-of-the-art monocular depth estimation techniques.
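The global integration step can be pictured with a small sketch. The code below is not the authors' implementation: it assumes the local depth features are already available as a 2-D array, and it stands in a generic iterated joint (cross) bilateral average for the paper's bilateral-type diffusion. The function names (`joint_bilateral_step`, `integrate_features`) and the parameters `radius`, `sigma_s`, `sigma_r`, and `iterations` are hypothetical and chosen only for illustration.

```python
import numpy as np


def joint_bilateral_step(feature, guide, radius=2, sigma_s=2.0, sigma_r=0.1):
    """One averaging pass over `feature`, weighted by spatial distance
    and by intensity similarity in `guide` (assumed grayscale in [0, 1])."""
    feature = np.asarray(feature, dtype=float)
    guide = np.asarray(guide, dtype=float)
    h, w = feature.shape
    pad_f = np.pad(feature, radius, mode="edge")
    pad_g = np.pad(guide, radius, mode="edge")
    num = np.zeros((h, w))
    den = np.zeros((h, w))
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            ys = slice(radius + dy, radius + dy + h)
            xs = slice(radius + dx, radius + dx + w)
            # Spatial weight: nearby pixels count more.
            w_s = np.exp(-(dx * dx + dy * dy) / (2.0 * sigma_s ** 2))
            # Range weight: pixels with similar guide intensity count more,
            # so values spread within regions but not across strong edges.
            w_r = np.exp(-((pad_g[ys, xs] - guide) ** 2) / (2.0 * sigma_r ** 2))
            weight = w_s * w_r
            num += weight * pad_f[ys, xs]
            den += weight
    # The centre pixel always has weight 1, so `den` is never zero.
    return num / den


def integrate_features(feature, guide, iterations=10, **kwargs):
    """Iterate the averaging pass to propagate sparse local cues (assumed scheme)."""
    out = np.asarray(feature, dtype=float)
    for _ in range(iterations):
        out = joint_bilateral_step(out, guide, **kwargs)
    return out
```

Because every pass is a convex combination of neighbouring values, the output stays within the range of the input features; the range weight is what keeps a cue found in one region from leaking into a neighbouring region of clearly different intensity.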
