Paper information - On the relevance of sparsity for image classification

In this paper we empirically analyze the importance of sparsifying representations for classification purposes. We focus on those obtained by convolving images with linear filters, which can be either hand-designed or learned, and perform extensive experiments on two important computer vision problems, image categorization and pixel classification. To this end, we adopt a simple modular architecture that encompasses many recently proposed models. The key outcome of our investigations is that enforcing sparsity constraints on features extracted in a convolutional architecture does not improve classification performance, whereas it does so when redundancy is artificially introduced. This is very relevant for practical purposes, since it implies that the expensive run-time optimization required to sparsify the representation is not always justified, and therefore that computational costs can be drastically reduced.
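
To make the cost contrast in the abstract concrete, here is a minimal NumPy sketch. It is not the paper's architecture: the function names, filter sizes, the penalty `lam`, and the choice of ISTA as the run-time sparsifying solver are all assumptions made for illustration. The first path extracts features with a single pass of linear convolutions; the second runs an iterative l1-regularized optimization for every input.

```python
import numpy as np

def convolve_valid(image, kernel):
    """2-D 'valid' convolution of a grayscale image with one linear filter."""
    flipped = kernel[::-1, ::-1]  # flip the kernel so this is a true convolution
    windows = np.lib.stride_tricks.sliding_window_view(image, kernel.shape)
    return np.einsum("ijkl,kl->ij", windows, flipped)

def filter_bank_features(image, filters):
    """Cheap path: one feature map per filter, no optimization at run time."""
    return np.stack([convolve_valid(image, f) for f in filters])

def soft_threshold(x, t):
    """Proximal operator of the l1 norm (shrinks values toward zero)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def sparse_objective(x, dictionary, z, lam):
    """0.5 * ||x - D z||^2 + lam * ||z||_1"""
    return 0.5 * np.sum((x - dictionary @ z) ** 2) + lam * np.sum(np.abs(z))

def ista_sparse_code(x, dictionary, lam=0.1, n_iter=50):
    """Expensive path: iterative shrinkage-thresholding, run per input."""
    step = 1.0 / np.linalg.norm(dictionary, 2) ** 2  # 1 / Lipschitz constant
    z = np.zeros(dictionary.shape[1])
    for _ in range(n_iter):
        grad = dictionary.T @ (dictionary @ z - x)
        z = soft_threshold(z - step * grad, step * lam)
    return z

# Hypothetical toy data: a random image, 4 random 5x5 filters,
# and an overcomplete (redundant) dictionary with 50 atoms for 25-D patches.
rng = np.random.default_rng(0)
image = rng.standard_normal((16, 16))
filters = rng.standard_normal((4, 5, 5))
features = filter_bank_features(image, filters)  # shape (4, 12, 12)

dictionary = rng.standard_normal((25, 50))
patch = image[:5, :5].ravel()
code = ista_sparse_code(patch, dictionary)       # shape (50,)
```

The convolutional path costs one filtering pass per image, whereas the sparse-coding path repeats `n_iter` matrix products for every patch, which is the run-time expense the abstract argues is not always justified.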
