Efficient classification using salient regions

Classification of images in many categorized datasets has rapidly improved in recent years. However, methods that perform well on particular datasets typically have one or more limitations, such as insufficient invariance to image transformations or significant performance degradation as the number of classes increases. We attempt to overcome these challenges by extracting and matching visual features only at the foci of visual saliency rather than across the entire scene. First, we propose a visual-saliency detection method that combines the respective merits of color-saliency boosting and global region-based contrast schemes to produce more accurate saliency maps. Then, using a single feature type, we obtain performance on three public datasets that compares well with other state-of-the-art approaches. Overall, our results indicate that robust and efficient fixation-based classification, which reduces the complexity of feature extraction, is possible.
