Paper information - Learning inter-related visual dictionary for object recognition


Object recognition is challenging, especially when objects from different categories are visually similar. In this paper, we present a novel joint dictionary learning (JDL) algorithm that exploits the visual correlation within a group of visually similar object categories: it learns one commonly shared dictionary for the group together with multiple category-specific dictionaries. To make the dictionaries more discriminative, dictionary learning is formulated as a joint optimization problem with an added discriminative term based on the Fisher discrimination criterion. In addition to the JDL model, we develop a classification scheme that takes better advantage of the multiple trained dictionaries. The effectiveness of the proposed algorithm has been evaluated on popular visual benchmarks.
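The abstract gives no formulas, so the following is only a rough sketch of how a shared dictionary and category-specific dictionaries might be combined at classification time. It is not the paper's classification scheme: the function names `ridge_code` and `classify` are hypothetical, the Fisher discrimination term is omitted, and a ridge-regularized least-squares code is substituted for sparse coding to keep the example short. Each sample is coded over the shared atoms plus one category's atoms, and the category with the smallest reconstruction residual is chosen.

```python
import numpy as np

def ridge_code(D, x, lam=0.1):
    """Ridge-regularized coefficients of x over dictionary D (columns are atoms).

    Assumption: ridge regression stands in for the sparse coding step.
    """
    k = D.shape[1]
    return np.linalg.solve(D.T @ D + lam * np.eye(k), D.T @ x)

def classify(x, shared, specific, lam=0.1):
    """Pick the category whose [shared | category-specific] dictionary
    reconstructs x with the smallest residual.

    shared:   array of shape (d, k0), atoms common to the whole group
    specific: list of arrays of shape (d, k_c), one per category
    Returns (predicted category index, list of residuals).
    """
    residuals = []
    for D_c in specific:
        D = np.hstack([shared, D_c])      # joint dictionary for this category
        a = ridge_code(D, x, lam)         # code x over shared + specific atoms
        residuals.append(float(np.linalg.norm(x - D @ a)))
    return int(np.argmin(residuals)), residuals

# Toy usage: one shared atom and one atom per category in R^4.
shared = np.array([[1.0], [0.0], [0.0], [0.0]])
specific = [
    np.array([[0.0], [1.0], [0.0], [0.0]]),  # category 0
    np.array([[0.0], [0.0], [1.0], [0.0]]),  # category 1
]
label, residuals = classify(np.array([1.0, 1.0, 0.0, 0.0]), shared, specific)
```

In this toy setup the sample is built from the shared atom plus the atom of category 0, so category 0 yields the smaller residual. The shared atoms absorb what the categories have in common, and only the category-specific atoms separate them.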
