Image Segmentation Using Subspace Representation and Sparse Decomposition

Image foreground extraction is a classical problem in image processing and vision, with a wide range of applications. In this dissertation, we focus on the extraction of text and graphics from mixed-content images, and design novel approaches for various aspects of this problem. We first propose a sparse decomposition framework, which models the background as lying in a subspace spanned by smooth basis vectors, and the foreground as a sparse and connected component. We then formulate an optimization problem for this decomposition, adding suitable regularization terms to the cost function to promote the desired characteristics of each component. We present two techniques to solve the proposed optimization problem, one based on the alternating direction method of multipliers (ADMM) and the other based on robust regression. Promising results are obtained for screen content image segmentation using the proposed algorithms. We then propose a robust subspace learning algorithm for representing the background component, using training images that may contain both background and foreground components as well as noise. With the learnt background subspace, we can further improve the segmentation results compared to using a fixed subspace. Lastly, we investigate a different class of signal/image decomposition problems, where only one signal component is active at each signal element. In this case, besides estimating each component, we need to find their supports, which can be specified by a binary mask. We formulate this as a mixed-integer programming problem that jointly estimates the two components and their supports, and solve it through an alternating optimization scheme. We show the application of this algorithm to various problems, including image segmentation, video motion segmentation, and the separation of text from textured images.
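To make the subspace-plus-sparse idea concrete, the following is a minimal one-dimensional sketch, not the dissertation's actual algorithm. It assumes a low-frequency DCT basis as the smooth background subspace, approximates a least-absolute-deviation (robust regression) fit with iteratively reweighted least squares, and marks samples with a large residual as foreground. The function names (`dct_basis`, `lad_fit`, `segment`), the number of basis vectors `k`, and the threshold `thresh` are all hypothetical choices made for illustration, and the connectivity regularization mentioned above is omitted.

```python
import numpy as np

def dct_basis(n, k):
    """First k orthonormal DCT-II basis vectors of length n, as columns.
    Assumption: a low-frequency DCT basis stands in for the smooth subspace."""
    t = (np.arange(n) + 0.5) / n
    P = np.cos(np.pi * np.outer(t, np.arange(k)))
    P[:, 0] *= 1.0 / np.sqrt(2.0)
    return P * np.sqrt(2.0 / n)

def lad_fit(x, P, iters=50, eps=1e-6):
    """Approximate argmin_a ||x - P a||_1 by iteratively reweighted least squares."""
    a = np.linalg.lstsq(P, x, rcond=None)[0]  # ordinary least-squares start
    for _ in range(iters):
        # Weights 1/|residual| turn the L1 cost into a weighted L2 cost.
        w = 1.0 / np.maximum(np.abs(x - P @ a), eps)
        Pw = P * w[:, None]
        a = np.linalg.solve(P.T @ Pw, Pw.T @ x)
    return a

def segment(x, k=4, thresh=10.0):
    """Fit a smooth background to x and flag large residuals as foreground."""
    P = dct_basis(len(x), k)
    background = P @ lad_fit(x, P)
    mask = np.abs(x - background) > thresh  # hypothetical fixed threshold
    return background, mask
```

Calling `segment(x)` on a row of pixel intensities returns the estimated smooth background and a boolean foreground mask. The L1 fit is used here because a sparse foreground acts like a set of outliers that an ordinary least-squares fit would absorb into the background estimate.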
