Learning Joint Intensity-Depth Sparse Representations

This paper presents a method for learning overcomplete dictionaries of atoms composed of two modalities that describe a 3D scene: 1) image intensity and 2) scene depth. We propose a novel joint basis pursuit (JBP) algorithm that finds related sparse features in the two modalities using conic programming, and we integrate it into a two-step dictionary learning algorithm. JBP differs from related convex algorithms in that it finds joint sparsity models with different atoms and different coefficient values for intensity and depth. This is crucial for recovering generative models in which the same sparse underlying causes (3D features) give rise to different signals (intensity and depth). We give a bound on the recovery error of the sparse coefficients obtained by JBP, and show numerically that JBP is superior to the group lasso algorithm. When applied to the Middlebury depth-intensity database, our learning algorithm converges to a set of related features, such as pairs of depth and intensity edges, or image textures and depth slants. Finally, we show that JBP outperforms state-of-the-art methods on depth inpainting for time-of-flight and Microsoft Kinect 3D data.
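The generative model described above, in which one set of sparse causes produces both an intensity signal and a depth signal, can be illustrated with a minimal sketch. It is not the JBP algorithm itself. It only samples a signal pair whose two coefficient vectors share a support while using different atoms and different values. The Gaussian random dictionaries, the dimensions, the coefficient range, and all function names are assumptions made for illustration and do not come from the paper.

```python
import random


def synthesize(dictionary, coeffs):
    """Return sum_i coeffs[i] * dictionary[i]; atoms are stored as rows."""
    dim = len(dictionary[0])
    return [sum(c * atom[k] for c, atom in zip(coeffs, dictionary))
            for k in range(dim)]


def sample_joint_pair(dict_intensity, dict_depth, sparsity, rng):
    """Sample an intensity/depth pair driven by one shared sparse support.

    Hypothetical illustration: atom i of each dictionary is active in both
    modalities or in neither, but its two coefficients are drawn independently.
    """
    n_atoms = len(dict_intensity)
    support = sorted(rng.sample(range(n_atoms), sparsity))
    a = [0.0] * n_atoms  # intensity coefficients
    b = [0.0] * n_atoms  # depth coefficients
    for i in support:
        a[i] = rng.choice([-1, 1]) * rng.uniform(0.5, 1.5)
        b[i] = rng.choice([-1, 1]) * rng.uniform(0.5, 1.5)
    y_intensity = synthesize(dict_intensity, a)
    y_depth = synthesize(dict_depth, b)
    return y_intensity, y_depth, a, b, support


def support_of(coeffs, tol=1e-12):
    """Indices of the nonzero entries of a coefficient vector."""
    return [i for i, c in enumerate(coeffs) if abs(c) > tol]


rng = random.Random(0)
n_atoms, dim, sparsity = 12, 8, 3  # overcomplete: more atoms than dimensions
dict_intensity = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_atoms)]
dict_depth = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_atoms)]

y_intensity, y_depth, a, b, support = sample_joint_pair(
    dict_intensity, dict_depth, sparsity, rng)
```

A joint sparse coder in this setting would be given `y_intensity`, `y_depth`, and the two dictionaries, and asked to recover `a` and `b`. The shared `support` is the structure that a joint method can exploit and that coding each modality separately would ignore.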
